1 \input texinfo @c -*-texinfo-*-
4 @settitle Finding Files
5 @c For double-sided printing, uncomment:
6 @c @setchapternewpage odd
17 * Finding files: (find). Operating on files matching certain criteria.
20 @dircategory Individual utilities
22 * find: (find)Invoking find. Finding and acting on files.
23 * locate: (find)Invoking locate. Finding files in a database.
24 * updatedb: (find)Invoking updatedb. Building the locate database.
25 * xargs: (find)Invoking xargs. Operating on many files.
30 This file documents the GNU utilities for finding files that match
31 certain criteria and performing various operations on them.
33 Copyright (C) 1994, 1996, 1998, 2000, 2001, 2003, 2004, 2005 Free
34 Software Foundation, Inc.
36 Permission is granted to make and distribute verbatim copies of
37 this manual provided the copyright notice and this permission notice
38 are preserved on all copies.
41 Permission is granted to process this file through TeX and print the
42 results, provided the printed document carries copying permission
43 notice identical to this one except for the removal of this paragraph
44 (this paragraph not being relevant to the printed manual).
47 Permission is granted to copy and distribute modified versions of this
48 manual under the conditions for verbatim copying, provided that the entire
49 resulting derived work is distributed under the terms of a permission
50 notice identical to this one.
52 Permission is granted to copy and distribute translations of this manual
53 into another language, under the above conditions for modified versions,
54 except that this permission notice may be stated in a translation approved
60 @subtitle Edition @value{EDITION}, for GNU @code{find} version @value{VERSION}
61 @subtitle @value{UPDATED}
62 @author by David MacKenzie
65 @vskip 0pt plus 1filll
72 @node Top, Introduction, , (dir)
73 @comment node-name, next, previous, up
75 This file documents the GNU utilities for finding files that match
76 certain criteria and performing various actions on them.
77 This is edition @value{EDITION}, for @code{find} version @value{VERSION}.
80 @c The master menu, created with texinfo-master-menu, goes here.
83 * Introduction:: Summary of the tasks this manual describes.
84 * Finding Files:: Finding files that match certain criteria.
85 * Actions:: Doing things to files you have found.
86 * Common Tasks:: Solutions to common real-world problems.
87 * Databases:: Maintaining file name databases.
88 * File Permissions:: How to control access to files.
89 * Reference:: Summary of how to invoke the programs.
90 * Security Considerations:: Security issues relating to findutils.
91 * Error Messages:: Explanations of some messages you might see.
92 * Primary Index:: The components of @code{find} expressions.
95 @node Introduction, Finding Files, Top, Top
98 This manual shows how to find files that meet criteria you specify, and
99 how to perform various actions on the files that you find. The
100 principal programs that you use to perform these tasks are @code{find},
101 @code{locate}, and @code{xargs}. Some of the examples in this manual
102 use capabilities specific to the GNU versions of those programs.
104 GNU @code{find} was originally written by Eric Decker, with enhancements
105 by David MacKenzie, Jay Plett, and Tim Wood. GNU @code{xargs} was
106 originally written by Mike Rendell, with enhancements by David
107 MacKenzie. GNU @code{locate} and its associated utilities were
108 originally written by James Woods, with enhancements by David MacKenzie.
109 The idea for @samp{find -print0} and @samp{xargs -0} came from Dan
110 Bernstein. The current maintainer of GNU findutils (and this manual) is
111 James Youngman. Many other people have contributed bug fixes, small
112 improvements, and helpful suggestions. Thanks!
114 Mail suggestions and bug reports for these programs to
115 @code{bug-findutils@@gnu.org}. Please include the version
116 number, which you can get by running @samp{find --version}.
127 For brevity, the word @dfn{file} in this manual means a regular file, a
128 directory, a symbolic link, or any other kind of node that has a
129 directory entry. A directory entry is also called a @dfn{file name}. A
130 file name may contain some, all, or none of the directories in a path
131 that leads to the file. These are all examples of what this manual
132 calls ``file names'':
139 /usr/local/include/termcap.h
142 A @dfn{directory tree} is a directory and the files it contains, all of
143 its subdirectories and the files they contain, etc. It can also be a
144 single non-directory file.
146 These programs enable you to find the files in one or more directory
151 have names that contain certain text or match a certain pattern;
153 are links to certain files;
155 were last used during a certain period of time;
157 are within a certain size range;
159 are of a certain type (regular file, directory, symbolic link, etc.);
161 are owned by a certain user or group;
163 have certain access permissions;
165 contain text that matches a certain pattern;
167 are within a certain depth in the directory tree;
169 or some combination of the above.
172 Once you have found the files you're looking for (or files that are
173 potentially the ones you're looking for), you can do more to them than
174 simply list their names. You can get any combination of the files'
175 attributes, or process the files in many ways, either individually or in
176 groups of various sizes. Actions that you might want to perform on the
177 files you have found include, but are not limited to:
187 change access permissions
192 This manual describes how to perform each of those tasks, and more.
197 The principal programs used for making lists of files that match given
198 criteria and running commands on them are @code{find}, @code{locate},
199 and @code{xargs}. An additional command, @code{updatedb}, is used by
200 system administrators to create databases for @code{locate} to use.
202 @code{find} searches for files in a directory hierarchy and prints
203 information about the files it found. It is run like this:
206 find @r{[}@var{file}@dots{}@r{]} @r{[}@var{expression}@r{]}
Here is a typical use of @code{find}. This example prints the names of
all files in the directory tree rooted in @file{/usr/src} whose names
end with @samp{.c} and that are larger than 100 kilobytes.
214 find /usr/src -name '*.c' -size +100k -print
217 Notice that the wildcard must be enclosed in quotes in order to
218 protect it from expansion by the shell.
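The following is a minimal sketch of the same kind of search, run against
a scratch directory so that the result is predictable. The directory
layout, file names, and sizes are hypothetical and chosen only for
illustration; it assumes @code{mktemp} and @code{head -c} are available.

```shell
# Hypothetical scratch tree; names and sizes are for illustration only.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/lib"
head -c 204800 /dev/zero > "$tmp/src/big.c"        # 200 KiB, should match
head -c 1024   /dev/zero > "$tmp/src/lib/small.c"  # 1 KiB, too small
head -c 204800 /dev/zero > "$tmp/src/big.h"        # large, but wrong suffix

# The quotes keep the shell from expanding *.c before find sees it.
big_c=$(cd "$tmp" && find src -name '*.c' -size +100k -print)
echo "$big_c"

rm -rf "$tmp"
```

Only the file that satisfies both tests is printed, because the two
primaries are joined by the implicit @samp{-and}.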
220 @code{locate} searches special file name databases for file names that
221 match patterns. The system administrator runs the @code{updatedb}
222 program to create the databases. @code{locate} is run like this:
225 locate @r{[}@var{option}@dots{}@r{]} @var{pattern}@dots{}
229 This example prints the names of all files in the default file name
230 database whose name ends with @samp{Makefile} or @samp{makefile}. Which
231 file names are stored in the database depends on how the system
232 administrator ran @code{updatedb}.
234 locate '*[Mm]akefile'
237 The name @code{xargs}, pronounced EX-args, means ``combine arguments.''
238 @code{xargs} builds and executes command lines by gathering together
239 arguments it reads on the standard input. Most often, these arguments
240 are lists of file names generated by @code{find}. @code{xargs} is run
244 xargs @r{[}@var{option}@dots{}@r{]} @r{[}@var{command} @r{[}@var{initial-arguments}@r{]}@r{]}
248 The following command searches the files listed in the file
249 @file{file-list} and prints all of the lines in them that contain the
252 xargs grep typedef < file-list
255 @node find Expressions
256 @section @code{find} Expressions
258 The expression that @code{find} uses to select files consists of one or
259 more @dfn{primaries}, each of which is a separate command line argument
260 to @code{find}. @code{find} evaluates the expression each time it
261 processes a file. An expression can contain any of the following types
266 affect overall operation rather than the processing of a specific file;
268 return a true or false value, depending on the file's attributes;
270 have side effects and return a true or false value; and
272 connect the other arguments and affect when and whether they are
276 You can omit the operator between two primaries; it defaults to
277 @samp{-and}. @xref{Combining Primaries With Operators}, for ways to
278 connect primaries into more complex expressions. If the expression
279 contains no actions other than @samp{-prune}, @samp{-print} is performed
280 on all files for which the entire expression is true (@pxref{Print File
283 Options take effect immediately, rather than being evaluated for each
284 file when their place in the expression is reached. Therefore, for
285 clarity, it is best to place them at the beginning of the expression.
287 Many of the primaries take arguments, which immediately follow them in
288 the next command line argument to @code{find}. Some arguments are file
289 names, patterns, or other strings; others are numbers. Numeric
arguments can be specified as

@table @code
@item +@var{n}
for greater than @var{n},
@item -@var{n}
for less than @var{n},
@item @var{n}
for exactly @var{n}.
@end table
301 @node Finding Files, Actions, Introduction, Top
302 @chapter Finding Files
304 By default, @code{find} prints to the standard output the names of the
305 files that match the given criteria. @xref{Actions}, for how to get more
306 information about the matching files.
320 * Combining Primaries With Operators::
326 Here are ways to search for files whose name matches a certain pattern.
327 @xref{Shell Pattern Matching}, for a description of the @var{pattern}
328 arguments to these tests.
Each of these tests has a case-sensitive version and a case-insensitive
version, whose name begins with @samp{i}. In a case-insensitive
comparison, the patterns @samp{fo*} and @samp{F??} match the file names
@file{Foo}, @file{FOO}, @file{foo}, @file{fOo}, etc.
336 * Base Name Patterns::
337 * Full Name Patterns::
338 * Fast Full Name Search::
339 * Shell Pattern Matching:: Wildcards used by these programs.
342 @node Base Name Patterns
343 @subsection Base Name Patterns
345 @deffn Test -name pattern
346 @deffnx Test -iname pattern
347 True if the base of the file name (the path with the leading directories
348 removed) matches shell pattern @var{pattern}. For @samp{-iname}, the
349 match is case-insensitive. To ignore a whole directory tree, use
350 @samp{-prune} (@pxref{Directories}). As an example, to find Texinfo
351 source files in @file{/usr/local/doc}:
354 find /usr/local/doc -name '*.texi'
358 Notice that the wildcard must be enclosed in quotes in order to
359 protect it from expansion by the shell.
Patterns for @samp{-name} and @samp{-iname} will match a file name with
a leading @samp{.}. For example, the command @samp{find /tmp -name
\*bar} will match the file @file{/tmp/.foobar}.
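As a small sketch of base name matching, the commands below build a
scratch directory (the file names are hypothetical) and show that a
wildcard matches a leading @samp{.} and that @samp{-iname} ignores case.

```shell
# Hypothetical file names, created in a throwaway directory.
tmp=$(mktemp -d)
touch "$tmp/.foobar" "$tmp/crowbar" "$tmp/barfoo" "$tmp/Makefile"

# '*bar' matches the hidden file as well as the visible one.
dot_matches=$(cd "$tmp" && find . -name '*bar' | LC_ALL=C sort)
printf '%s\n' "$dot_matches"

# -iname compares without regard to case.
insensitive=$(cd "$tmp" && find . -iname 'make*')
printf '%s\n' "$insensitive"

rm -rf "$tmp"
```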
366 @node Full Name Patterns
367 @subsection Full Name Patterns
369 @deffn Test -wholename pattern
370 @deffnx Test -iwholename pattern
371 True if the entire file name, starting with the command line argument
372 under which the file was found, matches shell pattern @var{pattern}.
373 For @samp{-iwholename}, the match is case-insensitive. To ignore a
374 whole directory tree, use @samp{-prune} rather than checking every
file in the tree (@pxref{Directories}). The ``entire file name'' as
used by @code{find} starts with the starting point specified on the command
line, and is not converted to an absolute pathname, so for example
@code{cd /; find tmp -wholename /tmp} will never match anything.
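The effect of the starting point can be sketched as follows. The
directory names are hypothetical, and the example assumes a @code{find}
that accepts @samp{-wholename}.

```shell
# Hypothetical project tree in a scratch directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/proj/src"
touch "$tmp/proj/src/main.c"

# The name tested is "proj/src/main.c", built from the starting point.
rel_hit=$(cd "$tmp" && find proj -wholename 'proj/src/*.c')

# An absolute pattern cannot match, because find never makes the name
# absolute when the starting point is relative.
abs_hit=$(cd "$tmp" && find proj -wholename "$tmp/proj/src/*.c")

echo "relative pattern: $rel_hit"
echo "absolute pattern: $abs_hit"

rm -rf "$tmp"
```

The second command prints nothing after the colon, since no file name
that @code{find} constructs begins with the absolute path.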
381 @deffn Test -path pattern
382 @deffnx Test -ipath pattern
383 These tests are deprecated, but work as for @samp{-wholename} and @samp{-iwholename},
384 respectively. The @samp{-ipath} test is a GNU extension, but @samp{-path} is also
385 provided by HP-UX @code{find}.
388 @deffn Test -regex expr
389 @deffnx Test -iregex expr
390 True if the entire file name matches regular expression @var{expr}.
391 This is a match on the whole path, not a search. For example, to match
392 a file named @file{./fubar3}, you can use the regular expression
393 @samp{.*bar.} or @samp{.*b.*3}, but not @samp{f.*r3}. @xref{Regexps, ,
394 Syntax of Regular Expressions, emacs, The GNU Emacs Manual}, for a
395 description of the syntax of regular expressions. For @samp{-iregex},
396 the match is case-insensitive.
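The difference between a match on the whole path and a search can be
checked with a short sketch. It recreates the @file{./fubar3} example in
a scratch directory and assumes the default regular expression syntax.

```shell
tmp=$(mktemp -d)
touch "$tmp/fubar3"

# Matches: the expression covers all of "./fubar3".
whole=$(cd "$tmp" && find . -regex '.*bar.')

# Does not match: nothing in the expression accounts for the leading "./".
partial=$(cd "$tmp" && find . -regex 'f.*r3')

echo "whole-path expression: $whole"
echo "partial expression:    $partial"

rm -rf "$tmp"
```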
399 @node Fast Full Name Search
400 @subsection Fast Full Name Search
402 To search for files by name without having to actually scan the
403 directories on the disk (which can be slow), you can use the
404 @code{locate} program. For each shell pattern you give it,
405 @code{locate} searches one or more databases of file names and displays
406 the file names that contain the pattern. @xref{Shell Pattern Matching},
407 for details about shell patterns.
409 If a pattern is a plain string---it contains no
410 metacharacters---@code{locate} displays all file names in the database
411 that contain that string. If a pattern contains
412 metacharacters, @code{locate} only displays file names that match the
413 pattern exactly. As a result, patterns that contain metacharacters
414 should usually begin with a @samp{*}, and will most often end with one
415 as well. The exceptions are patterns that are intended to explicitly
416 match the beginning or end of a file name.
If you only want @code{locate} to match against the last component of
the file names (the ``base name'' of the files), you can use the
@samp{--basename} option. The opposite behaviour is the default, but
can be selected explicitly by using the option @samp{--wholename}.
428 is almost equivalent to
430 find @var{directories} -name @var{pattern}
433 where @var{directories} are the directories for which the file name
434 databases contain information. The differences are that the
435 @code{locate} information might be out of date, and that @code{locate}
436 handles wildcards in the pattern slightly differently than @code{find}
437 (@pxref{Shell Pattern Matching}).
439 The file name databases contain lists of files that were on the system
440 when the databases were last updated. The system administrator can
441 choose the file name of the default database, the frequency with which
442 the databases are updated, and the directories for which they contain
445 Here is how to select which file name databases @code{locate} searches.
446 The default is system-dependent.
449 @item --database=@var{path}
451 Instead of searching the default file name database, search the file
452 name databases in @var{path}, which is a colon-separated list of
453 database file names. You can also use the environment variable
454 @code{LOCATE_PATH} to set the list of database files to search. The
455 option overrides the environment variable if both are used.
458 @node Shell Pattern Matching
459 @subsection Shell Pattern Matching
461 @code{find} and @code{locate} can compare file names, or parts of file
462 names, to shell patterns. A @dfn{shell pattern} is a string that may
463 contain the following special characters, which are known as
464 @dfn{wildcards} or @dfn{metacharacters}.
466 You must quote patterns that contain metacharacters to prevent the shell
467 from expanding them itself. Double and single quotes both work; so does
468 escaping with a backslash.
@table @code
@item *
Matches any zero or more characters.

@item ?
Matches any one character.

@item [@var{string}]
Matches exactly one character that is a member of the string
@var{string}. This is called a @dfn{character class}. As a shorthand,
@var{string} may contain ranges, which consist of two characters with a
dash between them. For example, the class @samp{[a-z0-9_]} matches a
lowercase letter, a digit, or an underscore. You can negate a class by
placing a @samp{!} or @samp{^} immediately after the opening bracket.
Thus, @samp{[^A-Z@@]} matches any character except an uppercase letter
or an at sign.

@item \
Removes the special meaning of the character that follows it. This
works even in character classes.
@end table
492 In the @code{find} tests that do shell pattern matching (@samp{-name},
493 @samp{-wholename}, etc.), wildcards in the pattern will match a @samp{.}
494 at the beginning of a file name. This is also the case for
495 @code{locate}. Thus, @samp{find -name '*macs'} will match a file
496 named @file{.emacs}, as will @samp{locate '*macs'}.
498 Slash characters have no special significance in the shell pattern
499 matching that @code{find} and @code{locate} do, unlike in the shell, in
500 which wildcards do not match them. Therefore, a pattern @samp{foo*bar}
501 can match a file name @samp{foo3/bar}, and a pattern @samp{./sr*sc} can
502 match a file name @samp{./src/misc}.
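A brief sketch of this behaviour, using the two patterns mentioned above
against a hypothetical scratch tree and a @code{find} that accepts
@samp{-wholename}:

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/src/misc" "$tmp/foo3"
touch "$tmp/foo3/bar"

# In both patterns the "*" is allowed to span a "/" in the file name.
star_dir=$(cd "$tmp" && find . -wholename './sr*sc')
star_file=$(cd "$tmp" && find . -wholename './foo*bar')

echo "$star_dir"
echo "$star_file"

rm -rf "$tmp"
```

In the shell itself, by contrast, @samp{./sr*sc} would not expand to
@file{./src/misc}, because shell wildcards stop at a slash.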
If you want to locate some files with the @code{locate} command but
don't need to see the full list, you can use the @samp{--limit} option
to see just a small number of results, or the @samp{--count} option to
display only the total number of matches.
512 There are two ways that files can be linked together. @dfn{Symbolic
513 links} are a special type of file whose contents are a portion of the
514 name of another file. @dfn{Hard links} are multiple directory entries
515 for one file; the file names all have the same index node (@dfn{inode})
524 @subsection Symbolic Links
Symbolic links are names that reference other files. GNU @code{find}
can handle symbolic links in one of two ways. First, it can
dereference the links for you: if it comes across a
symbolic link, it examines the file that the link points to, in order
to see if it matches the criteria you have specified. Second, it
can check the link itself, in case you are looking for the actual
link. If the file that the symbolic link points to is also within the
directory hierarchy you are searching with the @code{find} command,
you may not see a great deal of difference between these two
537 By default, @code{find} examines symbolic links themselves when it
538 finds them (and, if it later comes across the linked-to file, it will
539 examine that, too). If you would prefer @code{find} to dereference
540 the links and examine the file that each link points to, specify the
541 @samp{-L} option to @code{find}. You can explicitly specify the
542 default behaviour by using the @samp{-P} option. The @samp{-H}
543 option is a half-way-between option which ensures that any symbolic
544 links listed on the command line are dereferenced, but other symbolic
Symbolic links differ from ``hard links'' in that you
need permissions upon the linked-to file in order to be able to
dereference the link. This can mean that even if you specify the
@samp{-L} option, @code{find} may not be able to determine the properties of
the file that the link points to (because you don't have sufficient
permissions). In this situation, @code{find} uses the properties of
the link itself. This also occurs if a symbolic link exists but
points to a file that is missing.
The options controlling the behaviour of @code{find} with respect to
links are as follows:

@table @samp
@item -P
@code{find} does not dereference symbolic links at all. This is the
default behaviour. This option must be specified before any of the
path names on the command line.

@item -H
@code{find} does not dereference symbolic links (except in the case of
file names on the command line, which are dereferenced). If a
symbolic link cannot be dereferenced, the information for the symbolic
link itself is used. This option must be specified before any of the
path names on the command line.

@item -L
@code{find} dereferences symbolic links where possible, and where this
is not possible it uses the properties of the symbolic link itself.
This option must be specified before any of the path names on the
command line. Use of this option also implies the same behaviour as
the @samp{-noleaf} option. If you later use the @samp{-H} or
@samp{-P} options, this does not turn off @samp{-noleaf}.

@item -follow
This option forms part of the ``expression'' and must be specified
after the path names, but it is otherwise equivalent to @samp{-L}.
@end table
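The contrast between @samp{-P} and @samp{-L} can be seen in a short
sketch. The directory and link names are hypothetical; the link points
at a sibling directory, so no file system loop is involved.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/real"
touch "$tmp/real/data.txt"
ln -s real "$tmp/alias"        # symbolic link to a directory

# With -P (the default) the link is examined as a link.
links_P=$(cd "$tmp" && find -P . -type l)

# With -L the link is dereferenced, so nothing has type "l" here ...
links_L=$(cd "$tmp" && find -L . -type l)

# ... and the linked-to directory is searched through both names.
files_L=$(cd "$tmp" && find -L . -name data.txt | LC_ALL=C sort)

echo "links seen with -P: $links_P"
echo "links seen with -L: $links_L"
printf '%s\n' "$files_L"

rm -rf "$tmp"
```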
583 The following differences in behavior occur when the @samp{-L} option
588 @code{find} follows symbolic links to directories when searching
591 @samp{-lname} and @samp{-ilname} always return false (unless they
592 happen to match broken symbolic links).
594 @samp{-type} reports the types of the files that symbolic links point
597 Implies @samp{-noleaf} (@pxref{Directories}).
600 If the @samp{-L} option or the @samp{-H} option is used,
601 the filenames used as arguments to @samp{-newer}, @samp{-anewer}, and
602 @samp{-cnewer} are dereferenced and the timestamp from the pointed-to
603 file is used instead (if possible -- otherwise the timestamp from the
604 symbolic link is used).
606 @deffn Test -lname pattern
607 @deffnx Test -ilname pattern
608 True if the file is a symbolic link whose contents match shell pattern
609 @var{pattern}. For @samp{-ilname}, the match is case-insensitive.
610 @xref{Shell Pattern Matching}, for details about the @var{pattern}
611 argument. If the @samp{-L} option is in effect, this test will always
612 fail for symbolic links unless they are broken. So, to list any
613 symbolic links to @file{sysdep.c} in the current directory and its
614 subdirectories, you can do:
617 find . -lname '*sysdep.c'
622 @subsection Hard Links
624 Hard links allow more than one name to refer to the same file. To
625 find all the names which refer to the same file as NAME, use
626 @samp{-samefile NAME}. If you are not using the @samp{-L} option, you
627 can confine your search to one filesystem using the @samp{-xdev}
628 option. This is useful because hard links cannot point outside a
629 single filesystem, so this can cut down on needless searching.
631 If the @samp{-L} option is in effect, and NAME is in fact a symbolic
632 link, the symbolic link will be dereferenced. Hence you are searching
633 for other links (hard or symbolic) to the file pointed to by NAME. If
634 @samp{-L} is in effect but NAME is not itself a symbolic link, other
635 symbolic links to the file NAME will be matched.
You can also search for files by inode number. This can occasionally
be useful in diagnosing problems with filesystems, for example, because
@code{fsck} tends to print inode numbers. Inode numbers also
occasionally turn up in log messages for some types of software, and
are used to support the @code{ftok()} library function.
643 You can learn a file's inode number and the number of links to it by
644 running @samp{ls -li} or @samp{find -ls}.
You can search for hard links to inode number NUM by using @samp{-inum
NUM}. If there are any file system mount points below the directory
where you are starting the search, use the @samp{-xdev} option unless
you are also using the @samp{-L} option. Using @samp{-xdev}
saves needless searching, since hard links to a file must be on the
same filesystem. @xref{Filesystems}.
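As a sketch of both approaches, the commands below create two names for
one file in a scratch directory (all names are hypothetical) and then
find them first with @samp{-samefile} and then with @samp{-inum}, taking
the inode number from @samp{ls -i}.

```shell
tmp=$(mktemp -d)
echo hello > "$tmp/original"
ln "$tmp/original" "$tmp/alias"      # second name for the same inode
echo other > "$tmp/unrelated"

# All names that refer to the same file as "original".
same=$(cd "$tmp" && find . -xdev -samefile original | LC_ALL=C sort)

# The same search by inode number; "ls -i" prints the number first.
set -- $(ls -i "$tmp/original")
inode=$1
by_inum=$(cd "$tmp" && find . -xdev -inum "$inode" | LC_ALL=C sort)

printf '%s\n' "$same"
printf '%s\n' "$by_inum"

rm -rf "$tmp"
```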
653 @deffn Test -samefile NAME
654 File is a hard link to the same inode as NAME. If the @samp{-L}
655 option is in effect, symbolic links to the same file as NAME points to
660 File has inode number @var{n}. The @samp{+} and @samp{-} qualifiers
661 also work, though these are rarely useful.
664 You can also search for files that have a certain number of links, with
665 @samp{-links}. Directories normally have at least two hard links; their
666 @file{.} entry is the second one. If they have subdirectories, each of
667 those also has a hard link called @file{..} to its parent directory.
668 The @file{.} and @file{..} directory entries are not normally searched
669 unless they are mentioned on the @code{find} command line.
672 File has @var{n} hard links.
675 @deffn Test -links +n
676 File has more than @var{n} hard links.
679 @deffn Test -links -n
680 File has fewer than @var{n} hard links.
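A minimal sketch of the three forms of @samp{-links}, restricted to
regular files in a hypothetical scratch directory:

```shell
tmp=$(mktemp -d)
touch "$tmp/single"                  # one link
touch "$tmp/a"
ln "$tmp/a" "$tmp/b"                 # "a" and "b" now have two links each

multi=$(cd "$tmp" && find . -type f -links +1 | LC_ALL=C sort)
exactly_one=$(cd "$tmp" && find . -type f -links 1)
fewer_than_two=$(cd "$tmp" && find . -type f -links -2)

printf '%s\n' "$multi"
echo "$exactly_one"
echo "$fewer_than_two"

rm -rf "$tmp"
```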
686 Each file has three time stamps, which record the last time that certain
687 operations were performed on the file:
691 access (read the file's contents)
693 change the status (modify the file or its attributes)
695 modify (change the file's contents)
698 There is no timestamp that indicates when a file was @emph{created}.
700 You can search for files whose time stamps are within a certain age
701 range, or compare them to other time stamps.
705 * Comparing Timestamps::
709 @subsection Age Ranges
711 These tests are mainly useful with ranges (@samp{+@var{n}} and
715 @deffnx Test -ctime n
716 @deffnx Test -mtime n
717 True if the file was last accessed (or its status changed, or it was
718 modified) @var{n}*24 hours ago. The number of 24-hour periods since
719 the file's timestamp is always rounded down; therefore 0 means ``less
720 than 24 hours ago'', 1 means ``between 24 and 48 hours ago'', and so
727 True if the file was last accessed (or its status changed, or it was
728 modified) @var{n} minutes ago. These tests provide finer granularity of
729 measurement than @samp{-atime} et al., but rounding is done in a
730 similar way. For example, to list files in
731 @file{/u/bill} that were last read from 2 to 6 minutes ago:
734 find /u/bill -amin +2 -amin -6
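The age tests can be tried out on files whose time stamps are set by
hand. The sketch below uses the modification-time variants
(@samp{-mmin} and @samp{-mtime}) and assumes a @code{touch} that accepts
relative dates with @samp{-d}, as the GNU version does; the file names
are hypothetical.

```shell
tmp=$(mktemp -d)
touch                     "$tmp/new"
touch -d '4 minutes ago'  "$tmp/recent"
touch -d '36 hours ago'   "$tmp/yesterday"
touch -d '60 hours ago'   "$tmp/older"

# Modified from 2 to 6 minutes ago.
window=$(cd "$tmp" && find . -type f -mmin +2 -mmin -6)

# Ages are counted in whole 24-hour periods, rounded down:
# 36 hours is 1 period, 60 hours is 2 periods.
day0=$(cd "$tmp" && find . -type f -mtime 0 | LC_ALL=C sort)
day1=$(cd "$tmp" && find . -type f -mtime 1)
beyond=$(cd "$tmp" && find . -type f -mtime +1)

echo "$window"
printf '%s\n' "$day0"
echo "$day1"
echo "$beyond"

rm -rf "$tmp"
```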
738 @deffn Option -daystart
739 Measure times from the beginning of today rather than from 24 hours ago.
740 So, to list the regular files in your home directory that were modified
744 find ~ -daystart -type f -mtime 1
748 The @samp{-daystart} option is unlike most other options in that it
749 has an effect on the way that other tests are performed. The affected
750 tests are @samp{-amin}, @samp{-cmin}, @samp{-mmin}, @samp{-atime},
751 @samp{-ctime} and @samp{-mtime}.
753 @node Comparing Timestamps
754 @subsection Comparing Timestamps
756 As an alternative to comparing timestamps to the current time, you can
757 compare them to another file's timestamp. That file's timestamp could
758 be updated by another program when some event occurs. Or you could set
759 it to a particular fixed date using the @code{touch} command. For
760 example, to list files in @file{/usr} modified after February 1 of the
763 @c Idea from Rick Sladkey.
765 touch -t 02010000 /tmp/stamp$$
766 find /usr -newer /tmp/stamp$$
770 @deffn Test -anewer file
771 @deffnx Test -cnewer file
772 @deffnx Test -newer file
773 True if the file was last accessed (or its status changed, or it was
774 modified) more recently than @var{file} was modified. These tests are
775 affected by @samp{-follow} only if @samp{-follow} comes before them on
776 the command line. @xref{Symbolic Links}, for more information on
777 @samp{-follow}. As an example, to list any files modified since
778 @file{/bin/sh} was last modified:
781 find . -newer /bin/sh
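A self-contained sketch of the reference-file technique follows. The
file names and dates are hypothetical; the time stamps are set with
@samp{touch -t} so that the outcome does not depend on when the
commands are run.

```shell
tmp=$(mktemp -d)
touch -t 202001010000 "$tmp/before"
touch -t 202002010000 "$tmp/stamp"    # the reference file
touch -t 202003010000 "$tmp/after"

# Only files modified more recently than "stamp" are listed;
# the reference file is not newer than itself.
newer=$(cd "$tmp" && find . -type f -newer stamp)
echo "$newer"

rm -rf "$tmp"
```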
786 True if the file was last accessed @var{n} days after its status was
787 last changed. Useful for finding files that are not being used, and
788 could perhaps be archived or removed to save disk space.
794 @deffn Test -size n@r{[}bckwMG@r{]}
795 True if the file uses @var{n} units of space, rounding up. The units
796 are 512-byte blocks by default, but they can be changed by adding a
797 one-character suffix to @var{n}:
801 512-byte blocks (never 1024)
805 kilobytes (1024 bytes)
The `b' suffix always considers blocks to be 512 bytes. This is not
affected by the setting (or non-setting) of the POSIXLY_CORRECT
environment variable. (This behaviour is different from the behaviour of
the @samp{-ls} action.) If you want to use 1024-byte units, use the
The number can be prefixed with a `+' or a `-'. A plus sign indicates
that the test should succeed if the file uses more than @var{n} units
of storage (a common use of this test) and a minus sign
indicates that the test should succeed if the file uses less than
@var{n} units of storage. There is no `=' prefix, because that's the
827 The size does not count indirect blocks, but it does count blocks in
828 sparse files that are not actually allocated. In other words, it's
829 consistent with the result you get for @samp{ls -l} or @samp{wc -c}.
830 This handling of sparse files differs from the output of the @samp{%k}
831 and @samp{%b} format specifiers for the @samp{-printf} predicate.
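Because sizes are rounded up to whole units before the comparison, the
results near a unit boundary can be surprising. The sketch below
illustrates this with three hypothetical files, as observed with GNU
@code{find}.

```shell
tmp=$(mktemp -d)
head -c 1024 /dev/zero > "$tmp/exact"   # exactly 1 KiB: 1 unit
head -c 1025 /dev/zero > "$tmp/over"    # one byte more: rounds up to 2 units
: > "$tmp/empty"                        # 0 bytes: 0 units

one_k=$(cd "$tmp" && find . -type f -size 1k)
two_k=$(cd "$tmp" && find . -type f -size 2k)
more_than_1k=$(cd "$tmp" && find . -type f -size +1k)
less_than_1k=$(cd "$tmp" && find . -type f -size -1k)
in_bytes=$(cd "$tmp" && find . -type f -size 1025c)

echo "$one_k"
echo "$two_k"
echo "$more_than_1k"
echo "$less_than_1k"
echo "$in_bytes"

rm -rf "$tmp"
```

Note that @samp{-size -1k} selects only the empty file: any file with at
least one byte rounds up to one unit, which is not less than one.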
836 True if the file is empty and is either a regular file or a directory.
837 This might make it a good candidate for deletion. This test is useful
838 with @samp{-depth} (@pxref{Directories}) and @samp{-delete}
839 (@pxref{Single File}).
846 True if the file is of type @var{c}:
850 block (buffered) special
852 character (unbuffered) special
869 The same as @samp{-type} unless the file is a symbolic link. For
870 symbolic links: if @samp{-follow} has not been given, true if the file
871 is a link to a file of type @var{c}; if @samp{-follow} has been given,
872 true if @var{c} is @samp{l}. In other words, for symbolic links,
873 @samp{-xtype} checks the type of the file that @samp{-type} does not
874 check. @xref{Symbolic Links}, for more information on @samp{-follow}.
880 @deffn Test -user uname
881 @deffnx Test -group gname
882 True if the file is owned by user @var{uname} (belongs to group @var{gname}).
883 A numeric ID is allowed.
888 True if the file's numeric user ID (group ID) is @var{n}. These tests
889 support ranges (@samp{+@var{n}} and @samp{-@var{n}}), unlike
890 @samp{-user} and @samp{-group}.
894 @deffnx Test -nogroup
895 True if no user corresponds to the file's numeric user ID (no group
896 corresponds to the numeric group ID). These cases usually mean that the
897 files belonged to users who have since been removed from the system.
898 You probably should change the ownership of such files to an existing
899 user or group, using the @code{chown} or @code{chgrp} program.
905 @xref{File Permissions}, for information on how file permissions are
906 structured and how to specify them.
908 @deffn Test -perm mode
910 True if the file's permissions are exactly @var{mode} (which can be
911 numeric or symbolic).
913 If @var{mode} starts with @samp{-}, true if
914 @emph{all} of the permissions set in @var{mode} are set for the file;
915 permissions not set in @var{mode} are ignored.
916 If @var{mode} starts with @samp{/}, true if
917 @emph{any} of the permissions set in @var{mode} are set for the file;
918 permissions not set in @var{mode} are ignored.
919 This is a GNU extension.
If you don't use the @samp{/} or @samp{-} form with a symbolic mode
string, you may have to specify a rather complex mode string. For
example, @samp{-perm g=w} will only match files which have mode 0020
(that is, ones for which group write permission is the only permission
set). It is more likely that you will want to use the @samp{/} or
@samp{-} forms, for example @samp{-perm -g=w}, which matches any file
with group write permission.
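The three forms can be compared side by side in a short sketch. The file
names and modes are hypothetical, and the @samp{/} form assumes GNU
@code{find}.

```shell
tmp=$(mktemp -d)
touch "$tmp/only_gw" "$tmp/rw_rw_r" "$tmp/rw_r_r"
chmod 0020 "$tmp/only_gw"    # group write and nothing else
chmod 0664 "$tmp/rw_rw_r"
chmod 0644 "$tmp/rw_r_r"

# Exact match: the mode must be 0020 and nothing more.
exact=$(cd "$tmp" && find . -type f -perm g=w)

# "-" form: group write must be set; other bits are ignored.
all_of=$(cd "$tmp" && find . -type f -perm -g=w | LC_ALL=C sort)

# "/" form: any of the listed bits (here, owner write) is enough.
any_of=$(cd "$tmp" && find . -type f -perm /u=w | LC_ALL=C sort)

echo "$exact"
printf '%s\n' "$all_of"
printf '%s\n' "$any_of"

rm -rf "$tmp"
```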
Match files which have read and write permission for their owner
and group, and which the rest of the world can read but not write to.
Files which meet these criteria but have other permission bits set
(for example, if someone can execute the file) will not be matched.

Match files which have read and write permission for their owner
and group, and which the rest of the world can read but not write to,
without regard to the presence of any extra permission bits (for
example the executable bit). This will match a file which has mode
945 Match files which are writeable by somebody (their owner, or
946 their group, or anybody else).
949 Match files which are writeable by either their owner or their
950 group. The files don't have to be writeable by both the owner and
951 group to be matched; either will do.
960 Search for files which are writeable by both their owner and their
971 To search for files based on their contents, you can use the @code{grep}
972 program. For example, to find out which C source files in the current
973 directory contain the string @samp{thing}, you can do:
979 If you also want to search for the string in files in subdirectories,
980 you can combine @code{grep} with @code{find} and @code{xargs}, like
984 find . -name '*.[ch]' | xargs grep -l thing
987 The @samp{-l} option causes @code{grep} to print only the names of files
988 that contain the string, rather than the lines that contain it. The
989 string argument (@samp{thing}) is actually a regular expression, so it
990 can contain metacharacters. This method can be refined a little by
991 using the @samp{-r} option to make @code{xargs} not run @code{grep} if
992 @code{find} produces no output, and using the @code{find} action
993 @samp{-print0} and the @code{xargs} option @samp{-0} to avoid
994 misinterpreting files whose names contain spaces:
997 find . -name '*.[ch]' -print0 | xargs -r -0 grep -l thing
1000 For a fuller treatment of finding files whose contents match a pattern,
1001 see the manual page for @code{grep}.
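As a rough illustration of why the null-terminated form matters, the sketch below searches a scratch tree whose directory and file names contain spaces. The tree layout and names are invented for the example.

```shell
# Hypothetical source tree with blanks in its names.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/sub dir"
printf 'int thing;\n' > "$tmp/src/sub dir/has thing.c"
printf 'int other;\n' > "$tmp/src/none.h"

# -print0 and -0 keep each name intact; -r skips grep when nothing matches.
hits=$(cd "$tmp" && find . -name '*.[ch]' -print0 | xargs -r -0 grep -l thing)
echo "$hits"
rm -rf "$tmp"
```

With plain @samp{-print} the name would be split at each blank and @code{grep} would be handed file names that do not exist.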
1004 @section Directories
1006 Here is how to control which directories @code{find} searches, and how
1007 it searches them. These two options allow you to process a horizontal
1008 slice of a directory tree.
1010 @deffn Option -maxdepth levels
1011 Descend at most @var{levels} (a non-negative integer) levels of
1012 directories below the command line arguments. @samp{-maxdepth 0} means
1013 only apply the tests and actions to the command line arguments.
1016 @deffn Option -mindepth levels
1017 Do not apply any tests or actions at levels less than @var{levels} (a
1018 non-negative integer). @samp{-mindepth 1} means process all files
1019 except the command line arguments.
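A small sketch of how @samp{-maxdepth} and @samp{-mindepth} select a horizontal slice, using an invented three-level scratch tree:

```shell
# Hypothetical tree: top/f0, top/a/f1, top/a/b/f2
tmp=$(mktemp -d)
mkdir -p "$tmp/top/a/b"
touch "$tmp/top/f0" "$tmp/top/a/f1" "$tmp/top/a/b/f2"

# Depth 0 only: just the command line argument itself.
depth0=$(cd "$tmp" && find top -maxdepth 0)
# Exactly depth 1: the immediate children of the argument.
depth1=$(cd "$tmp" && find top -mindepth 1 -maxdepth 1 | LC_ALL=C sort)
# Depth 2 and below.
deeper=$(cd "$tmp" && find top -mindepth 2 | LC_ALL=C sort)

printf '%s\n---\n%s\n---\n%s\n' "$depth0" "$depth1" "$deeper"
rm -rf "$tmp"
```

Combining both options with the same value, as in the second command, is a common way to list one level of a tree.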
1022 @deffn Option -depth
1023 Process each directory's contents before the directory itself. Doing
1024 this is a good idea when producing lists of files to archive with
1025 @code{cpio} or @code{tar}. If a directory does not have write
1026 permission for its owner, its contents can still be restored from the
1027 archive since the directory's permissions are restored after its contents.
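The change in visiting order can be seen on a minimal scratch tree (the names here are placeholders):

```shell
# Hypothetical tree with a single chain: top/dir/file
tmp=$(mktemp -d)
mkdir -p "$tmp/top/dir"
touch "$tmp/top/dir/file"

# Default order: each directory is listed before its contents.
pre=$(cd "$tmp" && find top)
# With -depth: contents come first, the directory itself last.
post=$(cd "$tmp" && find top -depth)

printf 'default:\n%s\n-depth:\n%s\n' "$pre" "$post"
rm -rf "$tmp"
```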
1031 This is a deprecated synonym for @samp{-depth}, for compatibility with
1032 Mac OS X, FreeBSD and OpenBSD. The @samp{-depth} option is a POSIX
1033 feature, so it is better to use that.
1036 @deffn Action -prune
1037 If the file is a directory, do not descend into it. The result is
1038 true. For example, to skip the directory @file{src/emacs} and all
1039 files and directories under it, and print the names of the other files
1043 find . -wholename './src/emacs' -prune -o -print
1046 The above command will not print @file{./src/emacs} among its list of
1047 results. This however is not due to the effect of the @samp{-prune}
1048 action (which only prevents further descent; it doesn't make sure we
1049 ignore that item). Instead, this effect is due to the use of
1050 @samp{-o}. Since the left-hand side of the ``or'' condition has
1051 succeeded for @file{./src/emacs}, it is not necessary to evaluate the
1052 right-hand side (@samp{-print}) at all for this particular file. If
1053 you wanted to print that directory name you could use either an extra
1054 @samp{-print} action:
1057 find . -wholename './src/emacs' -prune -print -o -print
1060 or use the comma operator:
1063 find . -wholename './src/emacs' -prune , -print
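The three variants can be compared side by side on a scratch tree. This sketch makes two assumptions: the directory layout is invented, and @samp{-path} is used in place of @samp{-wholename}, which GNU @code{find} treats as a synonym.

```shell
# Hypothetical tree: src/emacs/a.c and src/other/b.c
tmp=$(mktemp -d)
mkdir -p "$tmp/src/emacs" "$tmp/src/other"
touch "$tmp/src/emacs/a.c" "$tmp/src/other/b.c"

# Pruned directory is neither descended into nor printed.
skip=$(cd "$tmp" && find . -path './src/emacs' -prune -o -print | LC_ALL=C sort)
# Extra -print: the directory name is printed, its contents are not.
keep=$(cd "$tmp" && find . -path './src/emacs' -prune -print -o -print | LC_ALL=C sort)
# Comma operator (GNU): -print runs for every file that is visited.
comma=$(cd "$tmp" && find . -path './src/emacs' -prune , -print | LC_ALL=C sort)

printf '%s\n---\n%s\n---\n%s\n' "$skip" "$keep" "$comma"
rm -rf "$tmp"
```

In all three cases @file{./src/emacs/a.c} is never visited; the variants differ only in whether @file{./src/emacs} itself appears in the output.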
1066 If the @samp{-depth} option is in effect, the subdirectories will have
1067 already been visited in any case. Hence @samp{-prune} has no effect
1073 Exit immediately (with return value zero if no errors have occurred).
1074 No child processes will be left running, but no more paths specified
1075 on the command line will be processed. For example, @code{find
1076 /tmp/foo /tmp/bar -print -quit} will print only @samp{/tmp/foo}.
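The same behaviour can be reproduced without touching @file{/tmp/foo}, here with two invented empty directories inside a temporary directory:

```shell
# Hypothetical starting points named foo and bar.
tmp=$(mktemp -d)
mkdir "$tmp/foo" "$tmp/bar"

# -quit fires after the first -print, so bar is never examined.
first=$(cd "$tmp" && find foo bar -print -quit)
echo "$first"
rm -rf "$tmp"
```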
1079 @deffn Option -noleaf
1080 Do not optimize by assuming that directories contain 2 fewer
1081 subdirectories than their hard link count. This option is needed when
1082 searching filesystems that do not follow the Unix directory-link
1083 convention, such as CD-ROM or MS-DOS filesystems or AFS volume mount
1084 points. Each directory on a normal Unix filesystem has at least 2 hard
1085 links: its name and its @file{.} entry. Additionally, its
1086 subdirectories (if any) each have a @file{..} entry linked to that
1087 directory. When @code{find} is examining a directory, after it has
1088 statted 2 fewer subdirectories than the directory's link count, it knows
1089 that the rest of the entries in the directory are non-directories
1090 (@dfn{leaf} files in the directory tree). If only the files' names need
1091 to be examined, there is no need to stat them; this gives a significant
1092 increase in search speed.
1095 @deffn Option -ignore_readdir_race
1096 If a file disappears after its name has been read from a directory but
1097 before @code{find} gets around to examining the file with @code{stat},
1098 don't issue an error message. If you don't specify this option, an
1099 error message will be issued. This option can be useful in system
1100 scripts (cron scripts, for example) that examine areas of the
1101 filesystem that change frequently (mail queues, temporary directories,
1102 and so forth), because this scenario is common for those sorts of
1103 directories. Completely silencing error messages from @code{find} is
1104 undesirable, so this option neatly solves the problem. There is no
1105 way to search one part of the filesystem with this option on and part
1106 of it with this option off, though.
1109 @deffn Option -noignore_readdir_race
1110 This option reverses the effect of the @samp{-ignore_readdir_race} option.
1115 @section Filesystems
1117 A @dfn{filesystem} is a section of a disk, either on the local host or
1118 mounted from a remote host over a network. Searching network
1119 filesystems can be slow, so it is common to make @code{find} avoid them.
1121 There are two ways to avoid searching certain filesystems. One way is
1122 to tell @code{find} to only search one filesystem:
1125 @deffnx Option -mount
1126 Don't descend directories on other filesystems. These options are synonyms.
1129 The other way is to check the type of filesystem each file is on, and
1130 not descend directories that are on undesirable filesystem types:
1132 @deffn Test -fstype type
1133 True if the file is on a filesystem of type @var{type}. The valid
1134 filesystem types vary among different versions of Unix; an incomplete
1135 list of filesystem types that are accepted on some version of Unix or
1138 ext2 ext3 proc sysfs ufs 4.2 4.3 nfs tmp mfs S51K S52K
1140 You can use @samp{-printf} with the @samp{%F} directive to see the types
1141 of your filesystems. The @samp{%D} directive shows the device number.
1142 @xref{Print File Information}. @samp{-fstype} is
1143 usually used with @samp{-prune} to avoid searching remote filesystems
1144 (@pxref{Directories}).
1147 @node Combining Primaries With Operators
1148 @section Combining Primaries With Operators
1150 Operators build a complex expression from tests and actions.
1151 The operators are, in order of decreasing precedence:
1154 @item @asis{( @var{expr} )}
1156 Force precedence. True if @var{expr} is true.
1158 @item @asis{! @var{expr}}
1159 @itemx @asis{-not @var{expr}}
1162 True if @var{expr} is false.
1164 @item @asis{@var{expr1 expr2}}
1165 @itemx @asis{@var{expr1} -a @var{expr2}}
1166 @itemx @asis{@var{expr1} -and @var{expr2}}
1169 And; @var{expr2} is not evaluated if @var{expr1} is false.
1171 @item @asis{@var{expr1} -o @var{expr2}}
1172 @itemx @asis{@var{expr1} -or @var{expr2}}
1175 Or; @var{expr2} is not evaluated if @var{expr1} is true.
1177 @item @asis{@var{expr1} , @var{expr2}}
1179 List; both @var{expr1} and @var{expr2} are always evaluated. True if
1180 @var{expr2} is true. The value of @var{expr1} is discarded. This
1181 operator lets you do multiple independent operations on one traversal,
1182 without depending on whether other operations succeeded. The two
1183 operations @var{expr1} and @var{expr2} are not always fully
1184 independent, since @var{expr1} might have side effects like touching
1185 or deleting files, or it might use @samp{-prune} which would also
1189 @code{find} searches the directory tree rooted at each file name by
1190 evaluating the expression from left to right, according to the rules of
1191 precedence, until the outcome is known (the left hand side is false for
1192 @samp{-and}, true for @samp{-or}), at which point @code{find} moves on
1193 to the next file name.
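A common consequence of these precedence rules is that an explicit action binds only to the last alternative of an ``or''. The sketch below, using invented file names, contrasts the ungrouped and the parenthesized forms:

```shell
# Hypothetical files: one .c, one .h, one unrelated.
tmp=$(mktemp -d)
touch "$tmp/a.c" "$tmp/b.h" "$tmp/c.txt"

# Parsed as: -name '*.c' -o ( -name '*.h' -a -print )
# so -print is only reached for the .h file.
implicit=$(cd "$tmp" && find . -name '*.c' -o -name '*.h' -print | LC_ALL=C sort)
# Parentheses force the "or" to be evaluated before -print is applied.
grouped=$(cd "$tmp" && find . \( -name '*.c' -o -name '*.h' \) -print | LC_ALL=C sort)

printf 'implicit:\n%s\ngrouped:\n%s\n' "$implicit" "$grouped"
rm -rf "$tmp"
```

The parentheses are escaped so that the shell passes them to @code{find} unchanged.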
1195 There are two other tests that can be useful in complex expressions:
1205 @node Actions, Common Tasks, Finding Files, Top
1208 There are several ways you can print information about the files that
1209 match the criteria you gave in the @code{find} expression. You can
1210 print the information either to the standard output or to a file that
1211 you name. You can also execute commands that have the file names as
1212 arguments. You can use those commands as further filters to select files.
1216 * Print File Information::
1222 @node Print File Name
1223 @section Print File Name
1225 @deffn Action -print
1226 True; print the full file name on the standard output, followed by a
1230 @deffn Action -fprint file
1231 True; print the full file name into file @var{file}, followed by a
1232 newline. If @var{file} does not exist when @code{find} is run, it is
1233 created; if it does exist, it is truncated to 0 bytes. The file names
1234 @file{/dev/stdout} and @file{/dev/stderr} are handled specially; they
1235 refer to the standard output and standard error output, respectively.
1238 @node Print File Information
1239 @section Print File Information
1242 True; list the current file in @samp{ls -dils} format on the standard
1243 output. The output looks like this:
1246 204744 17 -rw-r--r-- 1 djm staff 17337 Nov 2 1992 ./lwall-quotes
1253 The inode number of the file. @xref{Hard Links}, for how to find files
1254 based on their inode number.
1257 The number of blocks in the file. The block counts are of 1K blocks,
1258 unless the environment variable @code{POSIXLY_CORRECT} is set, in which
1259 case 512-byte blocks are used. @xref{Size}, for how to find files based
1263 The file's type and permissions. The type is shown as a dash for a
1264 regular file; for other file types, the letter used with @samp{-type} is
1265 shown (@pxref{Type}). The permissions are read, write, and execute for
1266 the file's owner, its group, and other users, respectively; a dash means
1267 the permission is not granted. @xref{File Permissions}, for more details
1268 about file permissions. @xref{Permissions}, for how to find files based
1269 on their permissions.
1272 The number of hard links to the file.
1275 The user who owns the file.
1281 The file's size in bytes.
1284 The date the file was last modified.
1287 The file's name. @samp{-ls} quotes non-printable characters in the file
1288 names using C-like backslash escapes.
1289 This may change soon, as the treatment of unprintable characters is
1290 harmonised for @samp{-ls}, @samp{-fls}, @samp{-print}, @samp{-fprint},
1291 @samp{-printf} and @samp{-fprintf}.
1295 @deffn Action -fls file
1296 True; like @samp{-ls} but write to @var{file} like @samp{-fprint}
1297 (@pxref{Print File Name}).
1300 @deffn Action -printf format
1301 True; print @var{format} on the standard output, interpreting @samp{\}
1302 escapes and @samp{%} directives. Field widths and precisions can be
1303 specified as with the @code{printf} C function. Format flags (like
1304 @samp{#} for example) may not work as you expect because many of the
1305 fields, even numeric ones, are printed with %s. This means though
1306 that the format flag @samp{-} will work; it forces left-alignment of
1307 the field. Unlike @samp{-print}, @samp{-printf} does not add a
1308 newline at the end of the string. If you want a newline at the end of
1309 the string, add a @samp{\n}.
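As a minimal sketch of a field width combined with the @samp{-} flag and an explicit newline, assuming GNU @code{find} and an invented five-byte file:

```shell
# Hypothetical file containing exactly five bytes.
tmp=$(mktemp -d)
printf 'hello' > "$tmp/a.txt"

# %-10f left-aligns the base name in a 10-column field; %s is the size.
line=$(cd "$tmp" && find . -type f -printf '%-10f|%s\n')
echo "$line"
rm -rf "$tmp"
```

Without the trailing @samp{\n} in the format, the records for successive files would run together on one line.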
1312 @deffn Action -fprintf file format
1313 True; like @samp{-printf} but write to @var{file} like @samp{-fprint}
1314 (@pxref{Print File Name}).
1319 * Format Directives::
1326 The escapes that @samp{-printf} and @samp{-fprintf} recognize are:
1334 Stop printing from this format immediately and flush the output.
1346 A literal backslash (@samp{\}).
1348 The character whose ASCII code is NNN (octal).
1351 A @samp{\} character followed by any other character is treated as an
1352 ordinary character, so they both are printed, and a warning message is
1353 printed to the standard error output (because it was probably a typo).
1355 @node Format Directives
1356 @subsection Format Directives
1358 @samp{-printf} and @samp{-fprintf} support the following format
1359 directives to print information about the file being processed. The
1360 field width and precision specifiers of the C @code{printf} function are
1361 supported, as applied to string (%s) types. That is, you can specify
1362 "minimum field width"."maximum field width" for each directive.
1363 Format flags (like @samp{#} for example) may not work as you expect
1364 because many of the fields, even numeric ones, are printed with %s.
1365 The format flag @samp{-} does work; it forces left-alignment of the
1368 @samp{%%} is a literal percent sign. A @samp{%} character followed by
1369 an unrecognised character (i.e. not a known directive or printf field
1370 width and precision specifier), is discarded (but the unrecognised character
1371 is printed), and a warning message is printed to the standard error output
1372 (because it was probably a typo).
1376 * Ownership Directives::
1378 * Location Directives::
1380 * Formatting Flags::
1383 @node Name Directives
1384 @subsubsection Name Directives
1389 File's name (not the absolute path name, but the name of the file as
1390 it was encountered by @code{find}, that is, as a relative path from one of
1391 the starting points).
1393 File's name with any leading directories removed (only the last element).
1396 Leading directories of file's name (all but the last element and the
1397 slash before it). If the file's name contains no slashes (for example
1398 because it was named on the command line and is in the current working
1399 directory), then ``%h'' expands to ``.''. This prevents ``%h/%f''
1400 expanding to ``/foo'', which would be surprising and probably not
1404 File's name with the name of the command line argument under which
1405 it was found removed from the beginning.
1408 Command line argument under which file was found.
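The name directives are easiest to compare when printed together for one file. In this sketch the starting point @file{top} and the file beneath it are invented, and GNU @code{find} is assumed:

```shell
# Hypothetical file top/sub/file.txt, searched from the starting point "top".
tmp=$(mktemp -d)
mkdir -p "$tmp/top/sub"
touch "$tmp/top/sub/file.txt"

names=$(cd "$tmp" && find top -type f -printf 'p=%p f=%f h=%h P=%P H=%H\n')
echo "$names"
rm -rf "$tmp"
```

Note that @samp{%H} followed by a slash and @samp{%P} reconstructs @samp{%p} for files below the starting point.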
1412 @node Ownership Directives
1413 @subsubsection Ownership Directives
1418 File's group name, or numeric group ID if the group has no name.
1421 @c TODO: Needs to support # flag and 0 flag
1422 File's numeric group ID.
1425 File's user name, or numeric user ID if the user has no name.
1428 @c TODO: Needs to support # flag
1429 File's numeric user ID.
1431 @c full support, including # and 0.
1432 File's permissions (in octal). If you always want to have a leading
1433 zero on the number, use the '#' format flag, for example '%#m'.
1436 @node Size Directives
1437 @subsubsection Size Directives
1441 The amount of disk space used for this file in 1K blocks. Since disk space is
1442 allocated in multiples of the filesystem block size this is usually greater
1443 than %s/1024, but it can also be smaller if the file is a sparse file (that is,
1446 The amount of disk space used for this file in 512-byte blocks. Since disk
1447 space is allocated in multiples of the filesystem block size this is usually
1448 greater than %s/512, but it can also be smaller if the file is a sparse file
1449 (that is, it has ``holes'').
1451 File's size in bytes.
1454 @node Location Directives
1455 @subsubsection Location Directives
1459 File's depth in the directory tree (depth below a file named on the
1460 command line, not depth below the root directory). Files named on the
1461 command line have a depth of 0. Subdirectories immediately below them
1462 have a depth of 1, and so on.
1464 The device number on which the file exists (the @code{st_dev} field of
1465 @code{struct stat}), in decimal.
1467 Type of the filesystem the file is on; this value can be used for
1468 @samp{-fstype} (@pxref{Directories}).
1470 Object of symbolic link (empty string if file is not a symbolic link).
1472 File's inode number (in decimal).
1474 Number of hard links to file.
1476 Type of the file as used with @samp{-type}. If the file is a symbolic
1477 link, @samp{l} will be printed.
1479 Type of the file as used with @samp{-type}. If the file is a symbolic
1480 link, it is dereferenced. If the file is a broken symbolic link,
1481 @samp{N} is printed.
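A short sketch of the depth, type, and link-target directives on an invented tree containing one symbolic link (GNU @code{find} assumed; the brackets around the link target are only there to make empty values visible):

```shell
# Hypothetical tree: top/dir/file plus a symbolic link top/link -> dir/file
tmp=$(mktemp -d)
mkdir -p "$tmp/top/dir"
touch "$tmp/top/dir/file"
ln -s dir/file "$tmp/top/link"

# Depth below the starting point, file type, name, and link target.
info=$(cd "$tmp" && find top -printf '%d %y %p [%l]\n' | LC_ALL=C sort)
echo "$info"
rm -rf "$tmp"
```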
1485 @node Time Directives
1486 @subsubsection Time Directives
1488 Some of these directives use the C @code{ctime} function. Its output
1489 depends on the current locale, but it typically looks like
1492 Wed Nov 2 00:42:36 1994
1497 File's last access time in the format returned by the C @code{ctime} function.
1499 File's last access time in the format specified by @var{k}
1500 (@pxref{Time Formats}).
1502 File's last status change time in the format returned by the C @code{ctime}
1505 File's last status change time in the format specified by @var{k}
1506 (@pxref{Time Formats}).
1508 File's last modification time in the format returned by the C @code{ctime}
1511 File's last modification time in the format specified by @var{k}
1512 (@pxref{Time Formats}).
1516 @subsection Time Formats
1518 Below are the formats for the directives @samp{%A}, @samp{%C}, and
1519 @samp{%T}, which print the file's timestamps. Some of these formats
1520 might not be available on all systems, due to differences in the C
1521 @code{strftime} function between systems.
1526 * Combined Time Formats::
1529 @node Time Components
1530 @subsubsection Time Components
1532 The following format directives print single components of the time.
1546 time zone (e.g., EDT), or nothing if no time zone is determinable
1552 seconds since Jan. 1, 1970, 00:00 GMT.
1555 @node Date Components
1556 @subsubsection Date Components
1558 The following format directives print single components of the date.
1562 locale's abbreviated weekday name (Sun..Sat)
1564 locale's full weekday name, variable length (Sunday..Saturday)
1567 locale's abbreviated month name (Jan..Dec)
1569 locale's full month name, variable length (January..December)
1573 day of month (01..31)
1577 day of year (001..366)
1579 week number of year with Sunday as first day of week (00..53)
1581 week number of year with Monday as first day of week (00..53)
1585 last two digits of year (00..99)
1588 @node Combined Time Formats
1589 @subsubsection Combined Time Formats
1591 The following format directives print combinations of time and date
1596 time, 12-hour (hh:mm:ss [AP]M)
1598 time, 24-hour (hh:mm:ss)
1600 locale's time representation (H:M:S)
1602 locale's date and time (Sat Nov 04 12:02:33 EST 1989)
1606 locale's date representation (mm/dd/yy)
1608 Date and time, separated by '+', for example `2004-04-28+22:22:05'.
1609 The time is given in the current timezone (which may be affected by
1610 setting the TZ environment variable). This is a GNU extension.
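Individual time and date components can be combined into a custom timestamp. The sketch below sets a known modification time on an invented file with @code{touch -t} and prints it back; it assumes GNU @code{find} and that both commands run in the same time zone.

```shell
# Hypothetical file whose modification time is set to 2004-04-28 22:22:05.
tmp=$(mktemp -d)
touch -t 200404282222.05 "$tmp/stamp"

# %T followed by a format letter selects one component of the mtime.
stamp=$(find "$tmp/stamp" -printf '%TY-%Tm-%Td %TH:%TM\n')
echo "$stamp"
rm -rf "$tmp"
```

The seconds field is left out here on purpose, since its exact form may differ between versions of @code{find}.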
1613 @node Formatting Flags
1614 @subsubsection Formatting Flags
1616 The @samp{%m} and @samp{%d} directives support the @samp{#}, @samp{0}
1617 and @samp{+} flags, but the other directives do not, even if they
1618 print numbers. Numeric directives that do not support these flags
1628 All fields support the format flag @samp{-}, which makes fields
1629 left-aligned. That is, if the field width is greater than the actual
1630 contents of the field, the requisite number of spaces are printed
1631 after the field content instead of before it.
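For instance, the @samp{#} flag on @samp{%m} can be tried on a scratch file with a known mode. The file name is invented and GNU @code{find} is assumed:

```shell
# Hypothetical file with mode 644.
tmp=$(mktemp -d)
touch "$tmp/f"
chmod 644 "$tmp/f"

# Plain %m prints the octal mode; %#m adds a leading zero.
modes=$(find "$tmp/f" -printf '%m %#m\n')
echo "$modes"
rm -rf "$tmp"
```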
1634 @section Run Commands
1636 You can use the list of file names created by @code{find} or
1637 @code{locate} as arguments to other commands. In this way you can
1638 perform arbitrary actions on the files.
1647 @subsection Single File
1649 Here is how to run a command on one file at a time.
1651 @deffn Action -execdir command ;
1652 Execute @var{command}; true if 0 status is returned. @code{find} takes
1653 all arguments after @samp{-execdir} to be part of the command until an
1654 argument consisting of @samp{;} is reached. It replaces the string
1655 @samp{@{@}} by the current file name being processed everywhere it
1656 occurs in the command. Both of these constructions need to be escaped
1657 (with a @samp{\}) or quoted to protect them from expansion by the shell.
1658 The command is executed in the directory that contains the matched file.
1660 For example, to compare each C header file in the current directory with
1661 the file @file{/tmp/master}:
1664 find . -name '*.h' -execdir diff -u '@{@}' /tmp/master ';'
1669 Another similar option, @samp{-exec} is supported, but is less secure.
1670 @xref{Security Considerations}, for a discussion of the security
1671 problems surrounding @samp{-exec}.
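The practical difference between the two actions shows up in what the braces expand to. This is a sketch under several assumptions: GNU @code{find}, an invented header file, and a fixed @code{PATH} of @file{/usr/bin:/bin}, because GNU @code{find} refuses to run @samp{-execdir} when @code{PATH} contains relative or empty entries.

```shell
# Hypothetical header file inc/a.h below the starting point.
tmp=$(mktemp -d)
mkdir -p "$tmp/inc"
touch "$tmp/inc/a.h"

# -exec runs in the invocation directory and passes the relative path.
via_exec=$(cd "$tmp" && PATH=/usr/bin:/bin find . -name '*.h' -exec echo '{}' ';')
# -execdir runs in the file's own directory and passes ./basename.
via_execdir=$(cd "$tmp" && PATH=/usr/bin:/bin find . -name '*.h' -execdir echo '{}' ';')

printf 'exec:    %s\nexecdir: %s\n' "$via_exec" "$via_execdir"
rm -rf "$tmp"
```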
1674 @deffn Action -exec command ;
1675 This insecure variant of the @samp{-execdir} action is specified by
1676 POSIX. The main difference is that the command is executed in the
1677 directory from which @code{find} was invoked, meaning that @samp{@{@}}
1678 is expanded to a relative path starting with the name of one of the
1679 starting directories, rather than just the basename of the matched
1684 @node Multiple Files
1685 @subsection Multiple Files
1687 Sometimes you need to process files one at a time. But usually this
1688 is not necessary, and it is faster to run a command on as many files
1689 as possible at a time, rather than once per file. Doing this saves on
1690 the time it takes to start up the command each time.
1692 The @samp{-execdir} and @samp{-exec} actions have variants that build
1693 command lines containing as many matched files as possible.
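The saving can be made visible by counting how many arguments each invocation receives. The sketch below uses @samp{-exec} with three invented files and a tiny shell command that prints its argument count:

```shell
# Hypothetical directory holding three files.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"

# ";" form: one shell per file, each sees a single argument.
per_file=$(find "$tmp" -type f -exec sh -c 'echo "$#"' sh '{}' ';')
# "+" form: one shell for the whole batch, all three names arrive together.
batched=$(find "$tmp" -type f -exec sh -c 'echo "$#"' sh '{}' +)

printf 'per file:\n%s\nbatched:\n%s\n' "$per_file" "$batched"
rm -rf "$tmp"
```

The extra @samp{sh} after the script fills @samp{$0}, so that the file names start at @samp{$1} and are all counted.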
1695 @deffn Action -execdir command @{@} +
1696 This works as for @samp{-execdir command ;}, except that the
1697 @samp{@{@}} at the end of the command is expanded to a list of names
1698 of matching files. This expansion is done in such a way as to avoid
1699 exceeding the maximum command line length available on the system.
1700 Only one @samp{@{@}} is allowed within the command, and it must appear
1701 at the end, immediately before the @samp{+}. A @samp{+} appearing in
1702 any position other than immediately after @samp{@{@}} is not
1703 considered to be special (that is, it does not terminate the command).
1707 @deffn Action -exec command @{@} +
1708 This insecure variant of the @samp{-execdir} action is specified by
1709 POSIX. The main difference is that the command is executed in the
1710 directory from which @code{find} was invoked, meaning that @samp{@{@}}
1711 is expanded to a relative path starting with the name of one of the
1712 starting directories, rather than just the basename of the matched
1716 Before @code{find} exits, any partially-built command lines are
1717 executed. This happens even if the exit was caused by the
1718 @samp{-quit} action. However, some types of error (for example not
1719 being able to invoke @code{stat()} on the current directory) can cause
1720 an immediate fatal exit. In this situation, any partially-built
1721 command lines will not be invoked (this prevents possible infinite
1724 Another, but less secure, way to run a command on more than one file
1725 at once, is to use the @code{xargs} command, which is invoked like this:
1728 xargs @r{[}@var{option}@dots{}@r{]} @r{[}@var{command} @r{[}@var{initial-arguments}@r{]}@r{]}
1731 @code{xargs} normally reads arguments from the standard input. These
1732 arguments are delimited by blanks (which can be protected with double
1733 or single quotes or a backslash) or newlines. It executes the
1734 @var{command} (default is @file{/bin/echo}) one or more times with any
1735 @var{initial-arguments} followed by arguments read from standard
1736 input. Blank lines on the standard input are ignored.
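These splitting rules can be observed directly by feeding @code{xargs} a hand-written input. In this sketch the input text is invented, and @samp{-n 1} is used only so that each parsed argument appears on its own line:

```shell
# Input: a quoted argument, a backslash-protected blank, a blank line, then "e".
parsed=$(printf '"a b" c\\ d\n\ne\n' | xargs -n 1 echo)
echo "$parsed"
```

Three arguments are produced: the quotes and the backslash protect the blanks, and the empty line contributes nothing.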
1738 Instead of blank-delimited names, it is safer to use @samp{find -print0}
1739 or @samp{find -fprint0} and process the output by giving the @samp{-0}
1740 or @samp{--null} option to GNU @code{xargs}, GNU @code{tar}, GNU
1741 @code{cpio}, or @code{perl}. The @code{locate} command also has a
1742 @samp{-0} or @samp{--null} option which does the same thing.
1744 You can use shell command substitution (backquotes) to process a list of
1745 arguments, like this:
1748 grep -l sprintf `find $HOME -name '*.c' -print`
1751 However, that method produces an error if the length of the @samp{.c}
1752 file names exceeds the operating system's command-line length limit.
1753 @code{xargs} avoids that problem by running the command as many times as
1754 necessary without exceeding the limit:
1757 find $HOME -name '*.c' -print | xargs grep -l sprintf
1760 However, if the command needs to have its standard input be a terminal
1761 (@code{less}, for example), you have to use the shell command
1762 substitution method or use the @samp{--arg-file} option of
1765 The @code{xargs} command will process all its input, building command
1766 lines and executing them, unless one of the commands exits with a
1767 status of 255 (this will cause @code{xargs} to issue an error message and
1768 stop) or it reads a line containing the end of file string specified
1769 with the @samp{--eof} option.
1772 * Unsafe File Name Handling::
1773 * Safe File Name Handling::
1774 * Unusual Characters in File Names::
1775 * Limiting Command Size::
1776 * Interspersing File Names::
1779 @node Unsafe File Name Handling
1780 @subsubsection Unsafe File Name Handling
1782 Because file names can contain quotes, backslashes, blank characters,
1783 and even newlines, it is not safe to process them using @code{xargs} in its
1784 default mode of operation. But since most files' names do not contain
1785 blanks, this problem occurs only infrequently. If you are only
1786 searching through files that you know have safe names, then you need not
1787 be concerned about it.
1789 @c This example is adapted from:
1790 @c From: pfalstad@stone.Princeton.EDU (Paul John Falstad)
1791 @c Newsgroups: comp.unix.shell
1792 @c Subject: Re: Beware xargs security holes
1793 @c Date: 16 Oct 90 19:12:06 GMT
1795 In many applications, if @code{xargs} botches processing a file because
1796 its name contains special characters, some data might be lost. The
1797 importance of this problem depends on the importance of the data and
1798 whether anyone notices the loss soon enough to correct it. However,
1799 here is an extreme example of the problems that using blank-delimited
1800 names can cause. If the following command is run daily from
1801 @code{cron}, then any user can remove any file on the system:
1804 find / -name '#*' -atime +7 -print | xargs rm
1807 For example, you could do something like this:
1815 and then @code{cron} would delete @file{/vmunix}, if it ran
1816 @code{xargs} with @file{/} as its current directory.
1818 To delete other files, for example @file{/u/joeuser/.plan}, you could do
1826 eg$ mkdir u u/joeuser u/joeuser/.plan'
1828 eg$ echo > u/joeuser/.plan'
1831 eg$ find . -name '#*' -print | xargs echo
1832 ./# ./# /u/joeuser/.plan /#foo
1835 @node Safe File Name Handling
1836 @subsubsection Safe File Name Handling
1838 Here is how to make @code{find} output file names so that they can be
1839 used by other programs without being mangled or misinterpreted. You can
1840 process file names generated this way by giving the @samp{-0} or
1841 @samp{--null} option to GNU @code{xargs}, GNU @code{tar}, GNU
1842 @code{cpio}, or @code{perl}.
1844 @deffn Action -print0
1845 True; print the full file name on the standard output, followed by a
1849 @deffn Action -fprint0 file
1850 True; like @samp{-print0} but write to @var{file} like @samp{-fprint}
1851 (@pxref{Print File Name}).
1854 As of findutils version 4.2.4, the @code{locate} program also has a
1855 @samp{--null} option which does the same thing. For similarity with
1856 @code{xargs}, the short form of the option @samp{-0} can also be used.
1858 If you want to be able to handle file names safely but need to run
1859 commands which want to be connected to a terminal on their input, you
1860 can use the @samp{--arg-file} option to @code{xargs} like this:
1863 find / -name xyzzy -print0 > list
1864 xargs --null --arg-file=list munge
1867 The example above runs the @code{munge} program on all the files named
1868 @file{xyzzy} that we can find, but @code{munge}'s input will still be
1869 the terminal (or whatever the shell was using as standard input). If
1870 your shell has the ``process substitution'' feature @samp{<(...)}, you
1871 can do this in just one step:
1874 xargs --null --arg-file=<(find / -name xyzzy -print0) munge
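The effect of null-terminated names can be checked by counting the arguments that the command finally receives. This sketch uses two invented files, one of which has a blank in its name, and assumes the temporary directory path itself contains no blanks:

```shell
# Hypothetical files: "plain" and "two words".
tmp=$(mktemp -d)
touch "$tmp/plain" "$tmp/two words"

# Blank-delimited: "two words" is split, so three arguments arrive.
unsafe=$(find "$tmp" -type f -print | xargs sh -c 'echo "$#"' sh)
# Null-delimited: each file name stays whole, so two arguments arrive.
safe=$(find "$tmp" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

printf 'unsafe: %s\nsafe: %s\n' "$unsafe" "$safe"
rm -rf "$tmp"
```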
1877 @node Unusual Characters in File Names
1878 @subsubsection Unusual Characters in File Names
1879 As discussed above, you often need to be careful about how the names
1880 of files are handled by @code{find} and other programs. If the output
1881 of @code{find} is not going to another program but instead is being
1882 shown on a terminal, this can still be a problem. For example, some
1883 character sequences can reprogram the function keys on some terminals.
1884 @xref{Security Considerations}, for a discussion of other security
1885 problems relating to @code{find}.
1887 Unusual characters are handled differently by various
1888 actions, as described below.
1893 Always print the exact filename, unchanged, even if the output is
1894 going to a terminal.
1897 Always print the exact filename, unchanged. This will probably change
1898 in a future release.
1901 Unusual characters are always escaped. White space, backslash, and
1902 double quote characters are printed using C-style escaping (for
1903 example @samp{\f}, @samp{\"}). Other unusual characters are printed
1904 using an octal escape. Other printable characters (for @samp{-ls} and
1905 @samp{-fls} these are the characters between octal 041 and 0176) are
1909 If the output is not going to a terminal, it is printed as-is.
1910 Otherwise, the result depends on which directive is in use:
1913 @item %D, %F, %H, %Y, %y
1914 These expand to values which are not under control of files' owners,
1915 and so are printed as-is.
1916 @item %a, %b, %c, %d, %g, %G, %i, %k, %m, %M, %n, %s, %t, %u, %U
1917 These have values which are under the control of files' owners but which
1918 cannot be used to send arbitrary data to the terminal, and so these
1920 @item %f, %h, %l, %p, %P
1921 The output of these directives is quoted if the output is going to a
1924 This quoting is performed in the same way as for GNU @code{ls}.
1925 This is not the same quoting mechanism as the one used for @samp{-ls} and
1926 @samp{-fls}. If you are able to decide what format to use for the output
1927 of @code{find} then it is normally better to use @samp{\0} as a terminator
1928 than to use newline, as file names can contain white space and newline
1933 Quoting is handled in the same way as for the @samp{%p} directive
1934 of @samp{-printf} and @samp{-fprintf}. If you are using @code{find} in
1935 a script or in a situation where the matched files might have
1936 arbitrary names, you should consider using @samp{-print0} instead of
1941 The @code{locate} program quotes and escapes unusual characters in
1942 file names in the same way as @code{find}'s @samp{-print} action.
1944 The behaviours described above may change soon, as the treatment of
1945 unprintable characters is harmonised for @samp{-ls}, @samp{-fls},
1946 @samp{-print}, @samp{-fprint}, @samp{-printf} and @samp{-fprintf}.
1948 @node Limiting Command Size
1949 @subsubsection Limiting Command Size
1951 @code{xargs} gives you control over how many arguments it passes to the
1952 command each time it executes it. By default, it uses up to
1953 @code{ARG_MAX} - 2k, or 128k, whichever is smaller, characters per
1954 command. It uses as many lines and arguments as fit within that limit.
1955 The following options modify those values.
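As a quick sketch of two of these options, @samp{-n} caps the number of arguments per command line and @samp{-r} (the short form of @samp{--no-run-if-empty} in GNU @code{xargs}) suppresses the command when there is no input. The input strings are invented:

```shell
# Five arguments, at most two per invocation of echo.
pairs=$(printf 'a b c d e\n' | xargs -n 2 echo)
# Empty input with -r: echo is never run, so nothing is printed.
empty=$(printf '' | xargs -r echo ran)

printf 'pairs:\n%s\nempty: [%s]\n' "$pairs" "$empty"
```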
1958 @item --no-run-if-empty
1960 If the standard input does not contain any nonblanks, do not run the
1961 command. By default, the command is run once even if there is no input.
1963 @item --max-lines@r{[}=@var{max-lines}@r{]}
1964 @itemx -l@r{[}@var{max-lines}@r{]}
1965 Use at most @var{max-lines} nonblank input lines per command line;
1966 @var{max-lines} defaults to 1 if omitted. Trailing blanks cause an
1967 input line to be logically continued on the next input line, for the
1968 purpose of counting the lines. Implies @samp{-x}.
1970 @item --max-args=@var{max-args}
1971 @itemx -n @var{max-args}
1972 Use at most @var{max-args} arguments per command line. Fewer than
1973 @var{max-args} arguments will be used if the size (see the @samp{-s}
1974 option) is exceeded, unless the @samp{-x} option is given, in which case
1975 @code{xargs} will exit.
1977 @item --max-chars=@var{max-chars}
1978 @itemx -s @var{max-chars}
1979 Use at most @var{max-chars} characters per command line, including the
1980 command and initial arguments and the terminating nulls at the ends of
1981 the argument strings. If you specify a value for this option which is
1982 too large or small, a warning message is printed and the appropriate
1983 upper or lower limit is used instead.
1985 @item --max-procs=@var{max-procs}
1986 @itemx -P @var{max-procs}
1987 Run up to @var{max-procs} processes at a time; the default is 1. If
1988 @var{max-procs} is 0, @code{xargs} will run as many processes as
1989 possible at a time. Use the @samp{-n}, @samp{-s}, or @samp{-l} option
1990 with @samp{-P}; otherwise chances are that the command will be run only
1994 @node Interspersing File Names
1995 @subsubsection Interspersing File Names
1997 @code{xargs} can insert the name of the file it is processing between
1998 arguments you give for the command. Unless you also give options to
1999 limit the command size (@pxref{Limiting Command Size}), this mode of
2000 operation is equivalent to @samp{find -exec} (@pxref{Single File}).
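The following sketch tries this mode on a scratch directory, assuming
GNU @code{xargs}. The directory, the file names and the @samp{XX}
replacement string are chosen for illustration only. The
@samp{-name} test keeps @code{find} from picking up the output files
while they are being created.

```shell
work=$(mktemp -d)
printf '%s\n' pear apple > "$work/a list.txt"   # the name contains a blank
printf '%s\n' 2 1 > "$work/b.txt"

# One sort per input line; XX is replaced by the whole line, blanks included.
find "$work" -type f -name '*.txt' | xargs -iXX sort -o XX.sorted XX

first=$(head -n 1 "$work/a list.txt.sorted")
echo "$first"
```

Because the input is split at newlines only, the blank in the first
file name does not break the name into two arguments.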
2003 @item --replace@r{[}=@var{replace-str}@r{]}
2004 @itemx -i@r{[}@var{replace-str}@r{]}
2005 Replace occurrences of @var{replace-str} in the initial arguments with
2006 names read from the input. Also, unquoted blanks do not
2007 terminate arguments; instead, the input is split at newlines only. If
2008 @var{replace-str} is omitted, it defaults to @samp{@{@}} (like for
2009 @samp{find -exec}). Implies @samp{-x} and @samp{-l 1}. As an
2010 example, to sort each file in the @file{bills} directory, leaving the
2011 output in that file name with @file{.sorted} appended, you could do:
2014 find bills -type f | xargs -iXX sort -o XX.sorted XX
2018 The equivalent command using @samp{find -execdir} is:
2021 find bills -type f -execdir sort -o '@{@}.sorted' '@{@}' ';'
2026 @subsection Querying
2028 To ask the user whether to execute a command on a single file, you can
2029 use the @code{find} primary @samp{-okdir} instead of @samp{-execdir},
2030 and the @code{find} primary @samp{-ok} instead of @samp{-exec}:
2032 @deffn Action -okdir command ;
2033 Like @samp{-execdir} (@pxref{Single File}), but ask the user first (on
2034 the standard input); if the response does not start with @samp{y} or
2035 @samp{Y}, do not run the command, and return false.
2038 @deffn Action -ok command ;
2039 This insecure variant of the @samp{-okdir} action is specified by
2040 POSIX. The main difference is that the command is executed in the
2041 directory from which @code{find} was invoked, meaning that @samp{@{@}}
2042 is expanded to a relative path starting with the name of one of the
2043 starting directories, rather than just the basename of the matched
2047 When processing multiple files with a single command, to query the user
2048 you give @code{xargs} the following option. When using this option, you
2049 might find it useful to control the number of files processed per
2050 invocation of the command (@pxref{Limiting Command Size}).
2055 Prompt the user about whether to run each command line and read a line
2056 from the terminal. Only run the command line if the response starts
2057 with @samp{y} or @samp{Y}. Implies @samp{-t}.
2061 @section Delete Files
2063 @deffn Action -delete
2064 Delete files or directories; true if removal succeeded. If the
2065 removal failed, an error message is issued.
2067 The use of the @samp{-delete} action on the command line automatically
2068 turns on the @samp{-depth} option (@pxref{find Expressions}).
2072 @section Adding Tests
2074 You can test for file attributes that none of the @code{find} builtin
2075 tests check. To do this, use @code{xargs} to run a program that filters
2076 a list of files printed by @code{find}. If possible, use @code{find}
2077 builtin tests to pare down the list, so the program run by @code{xargs}
2078 has less work to do. The tests built into @code{find} will likely run
2079 faster than tests that other programs perform.
2081 For reasons of efficiency it is often useful to limit the number of
2082 times an external program has to be run. For this reason, it is often
2083 a good idea to implement ``extended'' tests by using @code{xargs}.
2085 For example, here is a way to print the names of all of the unstripped
2086 binaries in the @file{/usr/local} directory tree. Builtin tests avoid
2087 running @code{file} on files that are not regular files or are not
2091 find /usr/local -type f -perm /a=x | xargs file |
2092 grep 'not stripped' | cut -d: -f1
2096 The @code{cut} program removes everything after the file name from the
2097 output of @code{file}.
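The same pattern works with any program that can filter a list of
files. As a sketch under assumed names (a scratch tree and a
@samp{TODO} marker, neither of which comes from this manual),
@code{grep -l} can serve as the extra test, with null-terminated names
so that blanks in file names are harmless:

```shell
tree=$(mktemp -d)
printf 'TODO: fix\n' > "$tree/needs work.txt"
printf 'done\n'      > "$tree/finished.txt"
printf 'TODO\n'      > "$tree/ignored.log"

# The builtin tests (-type, -name) pare the list down first;
# grep -l then acts as the extra test and prints the names that pass.
matches=$(find "$tree" -type f -name '*.txt' -print0 | xargs -0 grep -l TODO)
echo "$matches"
```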
2099 However, using @code{xargs} can present important security problems
2100 (@pxref{Security Considerations}). These can be avoided by using
2101 @samp{-execdir}. The @samp{-execdir} action is also a useful way of
2102 putting your own test in the middle of a set of other tests or actions
2103 for @code{find} (for example, you might want to use @samp{-prune}).
2105 @c Idea from Martin Weitzel.
2106 To place a special test somewhere in the middle of a @code{find}
2107 expression, you can use @samp{-execdir} (or, less securely,
2108 @samp{-exec}) to run a program that performs the test. Because
2109 @samp{-execdir} evaluates to the exit status of the executed program,
2110 you can use a program (which can be a shell script) that tests for a
2111 special attribute and make it exit with a true (zero) or false
2112 (non-zero) status. It is a good idea to place such a special test
2113 @emph{after} the builtin tests, because it starts a new process which
2114 could be avoided if a builtin test evaluates to false.
2116 Here is a shell script called @code{unstripped} that checks whether its
2117 argument is an unstripped binary file:
2121 file "$1" | grep -q "not stripped"
2125 This script relies on the fact that the shell exits with the status of
2126 the last command in the pipeline, in this case @code{grep}. The
2127 @code{grep} command exits with a true status if it found any matches,
2128 false if not. Here is an example of using the script (assuming it is
2129 in your search path). It lists the stripped executables (and shell
2130 scripts) in the file @file{sbins} and the unstripped ones in
2134 find /usr/local -type f -perm /a=x \
2135 \( -execdir unstripped '@{@}' \; -fprint ubins -o -fprint sbins \)
2140 @node Common Tasks, Databases, Actions, Top
2141 @chapter Common Tasks
2143 The sections that follow contain some extended examples that both give a
2144 good idea of the power of these programs, and show you how to solve
2145 common real-world problems.
2148 * Viewing And Editing::
2151 * Strange File Names::
2152 * Fixing Permissions::
2153 * Classifying Files::
2156 @node Viewing And Editing
2157 @section Viewing And Editing
2159 To view a list of files that meet certain criteria, simply run your file
2160 viewing program with the file names as arguments. Shells substitute a
2161 command enclosed in backquotes with its output, so the whole command
2165 less `find /usr/include -name '*.h' | xargs grep -l mode_t`
2169 You can edit those files by giving an editor name instead of a file
2173 emacs `find /usr/include -name '*.h' | xargs grep -l mode_t`
2176 Because there is a limit to the length of any individual command line,
2177 there is a limit to the number of files that can be handled in this
2178 way. We can get around this difficulty by using @code{xargs} like this:
2181 find /usr/include -name '*.h' | xargs grep -l mode_t > todo
2182 xargs --arg-file=todo emacs
2185 Here, @code{xargs} will run @code{emacs} as many times as necessary to
2186 visit all of the files listed in the file @file{todo}.
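To try the two-step approach without starting an editor, the sketch
below substitutes @code{cat} for @code{emacs} and works on a scratch
directory. The header files and their contents are made up for the
illustration. Since the list is newline-delimited, this form assumes
that the file names contain no blanks or newlines.

```shell
scratch=$(mktemp -d)
printf 'mode_t x;\n' > "$scratch/a.h"
printf 'int y;\n'    > "$scratch/b.h"

# Step 1: save the names of the matching files.
find "$scratch" -name '*.h' | xargs grep -l mode_t > "$scratch/todo"

# Step 2: read the names from the list file, leaving standard input alone.
shown=$(xargs --arg-file="$scratch/todo" cat)
echo "$shown"
```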
2191 You can pass a list of files produced by @code{find} to a file archiving
2192 program. GNU @code{tar} and @code{cpio} can both read lists of file
2193 names from the standard input---either delimited by nulls (the safe way)
2194 or by blanks (the lazy, risky default way). To use null-delimited
2195 names, give them the @samp{--null} option. You can store a file archive
2196 in a file, write it on a tape, or send it over a network to extract on
2199 One common use of @code{find} to archive files is to send a list of the
2200 files in a directory tree to @code{cpio}. Use @samp{-depth} so if a
2201 directory does not have write permission for its owner, its contents can
2202 still be restored from the archive since the directory's permissions are
2203 restored after its contents. Here is an example of doing this using
2204 @code{cpio}; you could use a more complex @code{find} expression to
2205 archive only certain files.
2208 find . -depth -print0 |
2209 cpio --create --null --format=crc --file=/dev/nrst0
2212 You could restore that archive using this command:
2215 cpio --extract --null --make-dir --unconditional \
2216 --preserve --file=/dev/nrst0
2219 Here are the commands to do the same things using @code{tar}:
2222 find . -depth -print0 |
2223 tar --create --null --files-from=- --file=/dev/nrst0
2225 tar --extract --null --preserve-perm --same-owner \
2229 @c Idea from Rick Sladkey.
2230 Here is an example of copying a directory from one machine to another:
2233 find . -depth -print0 | cpio -0o -Hnewc |
2234 rsh @var{other-machine} "cd `pwd` && cpio -i0dum"
2238 @section Cleaning Up
2240 @c Idea from Jim Meyering.
2241 This section gives examples of removing unwanted files in various situations.
2242 Here is a command to remove the CVS backup files created when an update
2246 find . -name '.#*' -print0 | xargs -0r rm -f
2249 The command above works, but the following is safer:
2252 find . -name '.#*' -depth -delete
2255 @c Idea from Franc,ois Pinard.
2256 You can run this command to clean out your clutter in @file{/tmp}. You
2257 might place it in the file your shell runs when you log out
2258 (@file{.bash_logout}, @file{.logout}, or @file{.zlogout}, depending on
2259 which shell you use).
2262 find /tmp -depth -user "$LOGNAME" -type f -delete
2265 If your @code{find} command removes directories, you may find that
2266 you get a spurious error message when @code{find} tries to recurse
2267 into a directory that has now been removed. Using the @samp{-depth}
2268 option will normally resolve this problem.
2270 @c Idea from Noah Friedman.
2271 To remove old Emacs backup and auto-save files, you can use a command
2272 like the following. It is especially important in this case to use
2273 null-terminated file names because Emacs packages like the VM mailer
2274 often create temporary file names with spaces in them, like @file{#reply
2275 to David J. MacKenzie<1>#}.
2278 find ~ \( -name '*~' -o -name '#*#' \) -print0 |
2279 xargs --no-run-if-empty --null rm -vf
2282 Removing old files from @file{/tmp} is commonly done from @code{cron}:
2284 @c Idea from Kaveh Ghazi.
2286 find /tmp /var/tmp -not -type d -mtime +3 -delete
2287 find /tmp /var/tmp -depth -mindepth 1 -type d -empty -delete
2290 The second @code{find} command above uses @samp{-depth} so it cleans out
2291 empty directories depth-first, hoping that the parents become empty and
2292 can be removed too. It uses @samp{-mindepth} to avoid removing
2293 @file{/tmp} itself if it becomes totally empty.
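Before installing such a job, the two commands can be rehearsed on a
scratch tree. The sketch below assumes GNU @code{find} and GNU
@code{touch} (for the @samp{-d} option); the directory layout is
invented for the illustration.

```shell
tmpdemo=$(mktemp -d)
mkdir -p "$tmpdemo/old/nested" "$tmpdemo/keep"
touch "$tmpdemo/keep/fresh"
touch -d '10 days ago' "$tmpdemo/old/nested/stale"

# Remove non-directories older than three days, then prune the
# directories that have become empty, deepest first.
find "$tmpdemo" -not -type d -mtime +3 -delete
find "$tmpdemo" -depth -mindepth 1 -type d -empty -delete

left=$(find "$tmpdemo" -mindepth 1 | sort)
echo "$left"
```

Only the directory holding the recent file survives; @file{old} and
@file{old/nested} are removed once the stale file is gone.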
2295 @node Strange File Names
2296 @section Strange File Names
2299 @c From: tmatimar@isgtec.com (Ted Timar)
2300 @c Newsgroups: comp.unix.questions,comp.unix.shell,comp.answers,news.answers
2301 @c Subject: Unix - Frequently Asked Questions (2/7) [Frequent posting]
2302 @c Subject: How do I remove a file with funny characters in the filename ?
2303 @c Date: Thu Mar 18 17:16:55 EST 1993
2304 @code{find} can help you remove or rename a file with strange characters
2305 in its name. People are sometimes stymied by files whose names contain
2306 characters such as spaces, tabs, control characters, or characters with
2307 the high bit set. The simplest way to remove such files is:
2310 rm -i @var{some*pattern*that*matches*the*problem*file}
2313 @code{rm} asks you whether to remove each file matching the given
2314 pattern. If you are using an old shell, this approach might not work if
2315 the file name contains a character with the high bit set; the shell may
2316 strip it off. A more reliable way is:
2319 find . -maxdepth 1 @var{tests} -okdir rm '@{@}' \;
2323 where @var{tests} uniquely identify the file. The @samp{-maxdepth 1}
2324 option prevents @code{find} from wasting time searching for the file in
2325 any subdirectories; if there are no subdirectories, you may omit it. A
2326 good way to uniquely identify the problem file is to figure out its
2333 Suppose you have a file whose name contains control characters, and you
2334 have found that its inode number is 12345. This command prompts you for
2335 whether to remove it:
2338 find . -maxdepth 1 -inum 12345 -okdir rm -f '@{@}' \;
2341 If you don't want to be asked, perhaps because the file name may contain
2342 a strange character sequence that will mess up your screen when printed,
2343 then use @samp{-execdir} instead of @samp{-okdir}.
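A non-interactive variant can be rehearsed on a scratch directory, as
sketched below. Two substitutions are made here for brevity and are
not part of the commands above: the inode number is obtained with the
@samp{%i} directive of @samp{-printf}, and @samp{-delete} takes the
place of @samp{-execdir rm}. GNU @code{find} is assumed.

```shell
odd=$(mktemp -d)
touch "$odd/$(printf 'bad\001name')"   # the name contains a control character

# Look the inode number up once, then select the file by that number.
inum=$(find "$odd" -maxdepth 1 -type f -printf '%i\n')
find "$odd" -maxdepth 1 -inum "$inum" -delete

remaining=$(find "$odd" -mindepth 1 | wc -l)
echo "files left: $remaining"
```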
2345 If you want to rename the file instead, you can use @code{mv} instead of
2349 find . -maxdepth 1 -inum 12345 -okdir mv '@{@}' @var{new-file-name} \;
2352 @node Fixing Permissions
2353 @section Fixing Permissions
2355 Suppose you want to make sure that everyone can write to the directories in a
2356 certain directory tree. Here is a way to find directories lacking either
2357 user or group write permission (or both), and fix their permissions:
2360 find . -type d -not -perm -ug=w | xargs chmod ug+w
2364 You could also reverse the operations, if you want to make sure that
2365 directories do @emph{not} have world write permission.
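Both directions can be sketched on a scratch tree. The version below
differs from the command above in passing null-terminated names
(@samp{-print0} with @samp{xargs -0}) so that directory names
containing blanks are handled; the directory names and modes are
assumptions made for the illustration, and GNU @code{find} and
@code{xargs} are assumed.

```shell
perm=$(mktemp -d)
mkdir "$perm/locked dir" "$perm/open"
chmod 555 "$perm/locked dir"
chmod 777 "$perm/open"

# Add user and group write permission where either is missing.
find "$perm" -type d -not -perm -ug=w -print0 | xargs -0 -r chmod ug+w

# The reverse operation: take world write permission away.
find "$perm" -type d -perm -o=w -print0 | xargs -0 -r chmod o-w

locked_mode=$(find "$perm/locked dir" -maxdepth 0 -printf '%m')
open_mode=$(find "$perm/open" -maxdepth 0 -printf '%m')
echo "$locked_mode $open_mode"
```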
2367 @node Classifying Files
2368 @section Classifying Files
2371 @c From: martin@mwtech.UUCP (Martin Weitzel)
2372 @c Newsgroups: comp.unix.wizards,comp.unix.questions
2373 @c Subject: Advanced usage of 'find' (Re: Unix security automating script)
2374 @c Date: 22 Mar 90 15:05:19 GMT
2375 If you want to classify a set of files into several groups based on
2376 different criteria, you can use the comma operator to perform multiple
2377 independent tests on the files. Here is an example:
2380 find / -type d \( -perm -o=w -fprint allwrite , \
2381 -perm -o=x -fprint allexec \)
2383 echo "Directories that can be written to by everyone:"
2386 echo "Directories with search permissions for everyone:"
2390 @code{find} has only to make one scan through the directory tree (which
2391 is one of the most time consuming parts of its work).
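A scaled-down sketch of the same technique, run on a scratch tree
rather than on @file{/}, shows how one pass fills both lists. The
directory names and modes are invented for the illustration; GNU
@code{find} is assumed.

```shell
cls=$(mktemp -d)
mkdir "$cls/pub" "$cls/ro" "$cls/priv"
chmod 777 "$cls/pub"
chmod 755 "$cls/ro"
chmod 700 "$cls/priv"

# Both sides of the comma are evaluated for every directory.
find "$cls" -mindepth 1 -type d \( -perm -o=w -fprint "$cls/allwrite" , \
                                   -perm -o=x -fprint "$cls/allexec" \)

allwrite=$(cat "$cls/allwrite")
allexec=$(sort "$cls/allexec")
```

Here @file{pub} lands in both lists, @file{ro} only in
@file{allexec}, and @file{priv} in neither.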
2393 @node Databases, File Permissions, Common Tasks, Top
2394 @chapter File Name Databases
2396 The file name databases used by @code{locate} contain lists of files
2397 that were in particular directory trees when the databases were last
2398 updated. The file name of the default database is determined when
2399 @code{locate} and @code{updatedb} are configured and installed. The
2400 frequency with which the databases are updated and the directories for
2401 which they contain entries depend on how often @code{updatedb} is run,
2402 and with which arguments.
2404 You can obtain some statistics about the databases by using
2405 @samp{locate --statistics}.
2408 * Database Locations::
2409 * Database Formats::
2410 * Newline Handling::
2414 @node Database Locations
2415 @section Database Locations
2417 There can be multiple file name databases. Users can select which
2418 databases @code{locate} searches using the @code{LOCATE_PATH}
2419 environment variable or a command line option. The system
2420 administrator can choose the file name of the default database, the
2421 frequency with which the databases are updated, and the directories
2422 for which they contain entries. File name databases are updated by
2423 running the @code{updatedb} program, typically nightly.
2425 In networked environments, it often makes sense to build a database at
2426 the root of each filesystem, containing the entries for that filesystem.
2427 @code{updatedb} is then run for each filesystem on the fileserver where
2428 that filesystem is on a local disk, to prevent thrashing the network.
2430 @xref{Invoking updatedb},
2431 for a description of the options to @code{updatedb} that specify
2432 the directories for which each database contains entries.
2435 @node Database Formats
2436 @section Database Formats
2438 The file name databases contain lists of files that were in particular
2439 directory trees when the databases were last updated. The file name
2440 database format changed starting with GNU @code{locate} version 4.0 to
2441 allow machines with different byte orderings to share the databases. The
2442 new GNU @code{locate} can read both the old and new database formats.
2443 However, old versions of @code{locate} and @code{find} produce incorrect
2444 results if given a new-format database.
2446 If you run @samp{locate --statistics}, the resulting summary indicates
2447 the type of each locate database.
2451 * New Database Format::
2453 * Old Database Format::
2456 @node New Database Format
2457 @subsection New Database Format
2459 @code{updatedb} runs a program called @code{frcode} to
2460 @dfn{front-compress} the list of file names, which reduces the database
2461 size by a factor of 4 to 5. Front-compression (also known as
2462 incremental encoding) works as follows.
2464 The database entries are a sorted list (case-insensitively, for users'
2465 convenience). Since the list is sorted, each entry is likely to share a
2466 prefix (initial string) with the previous entry. Each database entry
2467 begins with an offset-differential count byte, which is the additional
2468 number of characters of prefix of the preceding entry to use beyond the
2469 number that the preceding entry is using of its predecessor. (The
2470 counts can be negative.) Following the count is a null-terminated ASCII
2471 remainder---the part of the name that follows the shared prefix.
2473 If the offset-differential count is larger than can be stored in a byte
2474 (+/-127), the byte has the value 0x80 and the count follows in a 2-byte
2475 word, with the high byte first (network byte order).
2477 Every database begins with a dummy entry for a file called
2478 @file{LOCATE02}, which @code{locate} checks for to ensure that the
2479 database file has the correct format; it ignores the entry in doing the
2482 Databases cannot be concatenated together, even if the first (dummy)
2483 entry is trimmed from all but the first database. This is because the
2484 offset-differential count in the first entry of the second and following
2485 databases will be wrong.
2487 In the output of @samp{locate --statistics}, the new database format
2488 is referred to as @samp{LOCATE02}.
2490 @node Sample Database
2491 @subsection Sample Database
2493 Sample input to @code{frcode}:
2494 @c with nulls changed to newlines:
2498 /usr/src/cmd/aardvark.c
2499 /usr/src/cmd/armadillo.c
2503 Length of the longest prefix of the preceding entry to share:
2512 Output from @code{frcode}, with trailing nulls changed to newlines
2513 and count bytes made printable:
2523 (6 = 14 - 8, and -9 = 5 - 14)
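The arithmetic can be reproduced with a short shell sketch. This is
not how @code{frcode} is implemented: it prints the counts as decimal
text instead of count bytes, ends entries with newlines instead of
nulls, and omits the @file{LOCATE02} dummy entry. The four-name list
is an assumption for illustration; only its two middle names appear in
the sample input above. The function name is hypothetical.

```shell
# Length of the longest common prefix of two strings (hypothetical helper).
common_prefix_len() {
    a=$1 b=$2 n=0
    while [ -n "$a" ] && [ -n "$b" ]; do
        rest_a=${a#?} rest_b=${b#?}
        # Compare the first character of each string.
        [ "${a%"$rest_a"}" = "${b%"$rest_b"}" ] || break
        a=$rest_a b=$rest_b n=$((n + 1))
    done
    echo "$n"
}

prev='' prev_len=0 encoded=''
for name in /usr/src /usr/src/cmd/aardvark.c /usr/src/cmd/armadillo.c /usr/tmp/zoo; do
    len=$(common_prefix_len "$prev" "$name")
    diff=$((len - prev_len))          # the offset-differential count
    remainder=$(printf '%s' "$name" | cut -c "$((len + 1))-")
    encoded="$encoded$diff $remainder
"
    prev=$name prev_len=$len
done
printf '%s' "$encoded"
```

The third and fourth entries come out with the counts 6 and -9, which
are the differences worked out above.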
2525 @node Old Database Format
2526 @subsection Old Database Format
2528 The old database format is used by Unix @code{locate} and @code{find}
2529 programs and earlier releases of the GNU ones. @code{updatedb} produces
2530 this format if given the @samp{--old-format} option.
2532 @code{updatedb} runs programs called @code{bigram} and @code{code} to
2533 produce old-format databases. The old format differs from the new one
2534 in the following ways. Instead of each entry starting with an
2535 offset-differential count byte and ending with a null, byte values from
2536 0 through 28 indicate offset-differential counts from -14 through 14.
2537 The byte value indicating that a long offset-differential count follows
2538 is 0x1e (30), not 0x80. The long counts are stored in host byte order,
2539 which is not necessarily network byte order, and host integer word size,
2540 which is usually 4 bytes. They also represent a count 14 less than
2541 their value. The database lines have no termination byte; the start of
2542 the next line is indicated by its first byte having a value <= 30.
2544 In addition, instead of starting with a dummy entry, the old database
2545 format starts with a 256 byte table containing the 128 most common
2546 bigrams in the file list. A bigram is a pair of adjacent bytes. Bytes
2547 in the database that have the high bit set are indexes (with the high
2548 bit cleared) into the bigram table. The bigram and offset-differential
2549 count coding makes these databases 20-25% smaller than the new format,
2550 but makes them not 8-bit clean. Any byte in a file name that is in the
2551 ranges used for the special codes is replaced in the database by a
2552 question mark, which not coincidentally is the shell wildcard to match a
2555 The old format therefore cannot faithfully store entries with non-ASCII
2556 characters. It therefore should not be used in internationalized
2559 The output of @samp{locate --statistics} will give an incorrect count
2560 of the number of filenames containing newlines or high-bit characters
2561 for old-format databases.
2563 @node Newline Handling
2564 @section Newline Handling
2566 Within the database, filenames are terminated with a null character.
2567 This is the case for both the old and the new format.
2569 When the new database format is being used, however, the compression
2570 technique used to generate the database relies on the ability to sort the
2571 list of files before they are presented to @code{frcode}.
2573 If the system's sort command allows its input list of files to be
2574 separated with null characters via the @samp{-z} option, this option
2575 is used and therefore @code{updatedb} and @code{locate} will both
2576 correctly handle filenames containing newlines. If the @code{sort}
2577 command lacks support for this, the list of files is delimited with
2578 the newline character, meaning that parts of filenames containing
2579 newlines will be incorrectly sorted. This can result in both
2580 incorrect matches and incorrect failures to match.
2582 On the other hand, if you are using the old database format, filenames
2583 with embedded newlines are not correctly handled. There is no
2584 technical limitation which enforces this; it's just that the
2585 @code{bigram} program has not been updated to support lists of
2586 filenames separated by nulls.
2588 So, if you are using the new database format (this is the default) and
2589 your system uses GNU @code{sort}, newlines will be correctly handled
2590 at all times. Otherwise, newlines may not be correctly handled.
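The difference between the two delimiters can be seen with a small
sketch, assuming a @code{sort} that supports @samp{-z} (GNU
@code{sort} does). The three names are invented; the second one
contains a newline.

```shell
names=$(mktemp)
# Three null-terminated names: "b", "a<newline>z" and "c".
printf 'b\000a\nz\000c\000' > "$names"

# Null-delimited sorting keeps the three names intact.
count_z=$(sort -z "$names" | tr -cd '\000' | wc -c)

# Newline-delimited sorting sees four separate lines.
count_nl=$(tr '\000' '\n' < "$names" | sort | wc -l)

echo "records with -z: $count_z, lines without: $count_nl"
```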
2592 @node File Permissions, Reference, Databases, Top
2593 @chapter File Permissions
2597 @node Reference, Security Considerations, File Permissions, Top
2600 Below are summaries of the command line syntax for the programs
2601 discussed in this manual.
2606 * Invoking updatedb::
2610 @node Invoking find, Invoking locate, , Reference
2611 @section Invoking @code{find}
2614 find @r{[-H] [-L] [-P]} @r{[}@var{file}@dots{}@r{]} @r{[}@var{expression}@r{]}
2617 @code{find} searches the directory tree rooted at each file name
2618 @var{file} by evaluating the @var{expression} on each file it finds in
2621 The options @samp{-H}, @samp{-L} or @samp{-P} may be specified at the
2622 start of the command line (if none of these is specified, @samp{-P} is
2623 assumed). The arguments after these are a list of files or
2624 directories that should be searched.
2626 This list of files to search is followed by a list of expressions
2627 describing the files we wish to search for. The first part of the
2628 expression is recognised by the fact that it begins with @samp{-},
2629 @samp{(}, @samp{)}, @samp{,}, or @samp{!}. Any arguments after it are
2630 the rest of the expression. If no paths are given, the current
2631 directory is used. If no expression is given, the expression
2632 @samp{-print} is used.
2634 @code{find} exits with status 0 if all files are processed successfully,
2635 greater than 0 if errors occur.
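A script can act on this exit status. In the sketch below the second
path is assumed not to exist, and the @samp{&&} and @samp{||} chains
merely record the status in a variable.

```shell
# A starting point that exists: status 0.
find / -maxdepth 0 > /dev/null && ok_status=0 || ok_status=$?

# A starting point that does not exist: an error is reported, status above 0.
find /no/such/path-for-demo 2> /dev/null && bad_status=0 || bad_status=$?

echo "ok: $ok_status, failed: $bad_status"
```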
2637 Three options can precede the list of path names. They determine the
2638 way that symbolic links are handled.
2642 Never follow symbolic links (this is the default), except in the case
2643 of the @samp{-xtype} predicate.
2645 Always follow symbolic links, except in the case of the @samp{-xtype}
2648 Follow symbolic links specified in the list of paths to search, or
2649 which are otherwise specified on the command line.
2652 If @code{find} would follow a symbolic link, but cannot for any reason
2653 (for example, because it has insufficient permissions or the link is
2654 broken), it falls back on using the properties of the symbolic link
2655 itself. @xref{Symbolic Links}, for a more complete description of how
2656 symbolic links are handled.
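The effect of the three options can be sketched on a scratch tree
containing one directory, one file and one symbolic link to the
directory. All of the names are invented for the illustration, and
GNU @code{find} is assumed.

```shell
links=$(mktemp -d)
mkdir "$links/real"
touch "$links/real/file"
ln -s real "$links/alias"

# -P: the link named on the command line is not followed.
p_hits=$(find -P "$links/alias" -type f | wc -l)

# -H: a link named on the command line is followed.
h_hits=$(find -H "$links/alias" -type f | wc -l)

# -L: every link is followed, so the file is reached by two routes.
l_hits=$(find -L "$links" -type f | wc -l)

echo "-P: $p_hits  -H: $h_hits  -L: $l_hits"
```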
2658 @xref{Primary Index}, for a summary of all of the tests, actions, and
2659 options that the expression can contain. If the expression is
2660 missing, @samp{-print} is assumed.
2664 @code{find} also recognizes two options for administrative use:
2668 Print a summary of the command-line argument format and exit.
2670 Print the version number of @code{find} and exit.
2675 * Warning Messages::
2679 @node Warning Messages,,, Invoking find
2680 @subsection Warning Messages
2682 If there is an error on the @code{find} command line, an error message
2683 is normally issued. However, there are some usages that are
2684 inadvisable but which @code{find} should still accept. Under these
2685 circumstances, @code{find} may issue a warning message. By default,
2686 warnings are enabled only if @code{find} is being run interactively
2687 (specifically, if the standard input is a terminal). Warning messages
2688 can be controlled explicitly by the use of options on the command
2693 Issue warning messages where appropriate.
2695 Do not issue warning messages.
2698 These options take effect at the point on the command line where they
2699 are specified. Therefore if you specify @samp{-nowarn} at the end of
2700 the command line, you will not see warning messages for any problems
2701 occurring before that. The warning messages affected by the above
2702 options are triggered by:
2706 Use of the @samp{-d} option which is deprecated; please use
2707 @samp{-depth} instead, since the latter is POSIX-compliant.
2709 Use of the @samp{-ipath} option which is deprecated; please use
2710 @samp{-iwholename} instead.
2712 Specifying an option (for example @samp{-mindepth}) after a non-option
2713 (for example @samp{-type} or @samp{-print}) on the command line.
2717 The default behaviour above is designed to work in that way so that
2718 existing shell scripts which use such constructs don't generate
2719 spurious errors, but people will be made aware of the problem.
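The third trigger can be observed with the sketch below, which places
@samp{-maxdepth} after the test @samp{-type} and captures only the
standard error stream. It assumes GNU @code{find}; the wording of the
warning is not checked because it differs between versions.

```shell
wdir=$(mktemp -d)

# Warnings switched on explicitly: a message is written to standard error.
warned=$(find "$wdir" -warn -type d -maxdepth 0 2>&1 > /dev/null)

# Warnings switched off: the same command line stays silent.
quiet=$(find "$wdir" -nowarn -type d -maxdepth 0 2>&1 > /dev/null)

echo "$warned"
```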
2721 Some warning messages are issued for less common or more serious
2722 problems, and so cannot be turned off:
2726 Use of an unrecognised backslash escape sequence with @samp{-fprintf}
2728 Use of an unrecognised formatting directive with @samp{-fprintf}
2731 @node Invoking locate, Invoking updatedb, Invoking find, Reference
2732 @section Invoking @code{locate}
2735 locate @r{[}@var{option}@dots{}@r{]} @var{pattern}@dots{}
2741 Print only names which match all non-option arguments, not those matching
2742 one or more non-option arguments.
2746 The specified pattern is matched against just the last component of
2747 the name of the file in the locate database. This last component is
2748 also called the ``base name''. For example, the base name of
2749 @file{/tmp/mystuff/foo.old.c} is @file{foo.old.c}. If the pattern
2750 contains metacharacters, it must match the base name exactly. If not,
2751 it must match part of the base name.
2755 Instead of printing the matched filenames, just print the total
2756 number of matches we found, unless @samp{--print} (@samp{-p}) is also
2760 @item --database=@var{path}
2761 @itemx -d @var{path}
2762 Instead of searching the default file name database, search the file
2763 name databases in @var{path}, which is a colon-separated list of
2764 database file names. You can also use the environment variable
2765 @code{LOCATE_PATH} to set the list of database files to search. The
2766 option overrides the environment variable if both are used. Empty
2767 elements in @var{path} (that is, a leading or trailing colon, or two
2768 colons in a row) are taken to stand for the default database.
2769 A database can be supplied on stdin, using @samp{-} as an element
2770 of @var{path}. If more than one element of @var{path} is @samp{-},
2771 later instances are ignored (but a warning message is printed).
2775 Print only names which currently exist (instead of names
2776 which existed when the database was created). Note that this may slow
2777 down the program a lot, if there are many matches in the database.
2778 The way in which broken symbolic links are treated is affected by the
2779 @samp{-L}, @samp{-P} and @samp{-H} options.
2781 @item --non-existing
2783 Print only names which currently do not exist (instead of
2784 names which existed when the database was created). Note that
2785 this may slow down the program a lot, if there are many matches in the
2786 database. The way in which broken symbolic links are treated is
2787 affected by the @samp{-L}, @samp{-P} and @samp{-H} options.
2791 If testing for the existence of files (with the @samp{-e} or @samp{-E}
2792 options), consider broken symbolic links to be non-existing. This is
2799 If testing for the existence of files (with the @samp{-e} or @samp{-E}
2800 options), treat broken symbolic links as if they were existing files.
2801 The @samp{-H} form of this option is provided purely for similarity
2802 with @code{find}; the use of @samp{-P} is recommended over @samp{-H}.
2806 Ignore case distinctions in both the pattern and the file names.
2810 Limit the number of results printed to N. If you use the
2811 @samp{--count} option, the value printed will never be larger than
2816 Accepted but does nothing. The option is supported only to provide
2817 compatibility with BSD's @code{locate}.
2821 Results are separated with the ASCII NUL character rather than the
2822 newline character. To get the full benefit of the use of this option,
2823 use the new locate database format (that is the default anyway).
2827 Print search results when they normally would not, because of the presence
2828 of @samp{--statistics} (@samp{-S}) or @samp{--count} (@samp{-c}).
2832 The specified pattern is matched against the whole name of the file in
2833 the locate database. If the pattern contains metacharacters, it must
2834 match exactly. If not, it must match part of the whole file name.
2835 This is the default behaviour.
2839 Instead of using substring or shell glob matching, the pattern
2840 specified on the command line is understood to be a POSIX extended
2841 regular expression. Filenames from the locate database which match
2842 the specified regular expression are printed (or counted). If the
2843 @samp{-i} flag is also given, matching is case-insensitive. Matches
2844 are performed against the whole path name, and so by default a
2845 pathname will be matched if any part of it matches the specified
2846 regular expression. The regular expression may use @samp{^} or
2847 @samp{$} to anchor a match at the beginning or end of a pathname.
2851 Accepted but does nothing. The option is supported only to provide
2852 compatibility with BSD's @code{locate}.
2856 Print some summary information for each locate database. No search is
2857 performed unless non-option arguments are given.
2860 Print a summary of the options to @code{locate} and exit.
2863 Print the version number of @code{locate} and exit.
2866 @node Invoking updatedb, Invoking xargs, Invoking locate, Reference
2867 @section Invoking @code{updatedb}
2870 updatedb @r{[}@var{option}@dots{}@r{]}
2874 @item --findoptions='@var{OPTION}@dots{}'
2875 Global options to pass on to @code{find}.
2876 The environment variable @code{FINDOPTIONS} also sets this value.
2879 @item --localpaths='@var{path}@dots{}'
2880 Non-network directories to put in the database.
2881 Default is @file{/}.
2883 @item --netpaths='@var{path}@dots{}'
2884 Network (NFS, AFS, RFS, etc.) directories to put in the database.
2885 The environment variable @code{NETPATHS} also sets this value.
2888 @item --prunepaths='@var{path}@dots{}'
2889 Directories to omit from the database, which would otherwise be included.
2890 The environment variable @code{PRUNEPATHS} also sets this value.
2891 Default is @file{/tmp /usr/tmp /var/tmp /afs}.
2893 @item --prunefs='@var{path}@dots{}'
2894 File systems to omit from the database, which would otherwise be included.
2895 Note that files are pruned when a file system is reached;
2896 any file system mounted under an undesired file system will be ignored.
2898 The environment variable @code{PRUNEFS} also sets this value.
2899 Default is @file{nfs NFS proc}.
2901 @item --output=@var{dbfile}
2902 The database file to build.
2903 Default is system-dependent, but typically @file{/usr/local/var/locatedb}.
2905 @item --localuser=@var{user}
2906 The user to search the non-network directories as, using @code{su}.
2907 Default is to search the non-network directories as the current user.
2908 You can also use the environment variable @code{LOCALUSER} to set this user.
2910 @item --netuser=@var{user}
2911 The user to search network directories as, using @code{su}.
2912 Default is @code{daemon}.
2913 You can also use the environment variable @code{NETUSER} to set this user.
2916 Generate a locate database in the old format, for compatibility with
2917 versions of @code{locate} other than GNU @code{locate}. Using this
2918 option means that @code{locate} will not be able to properly handle
2919 non-ASCII characters in filenames (that is, filenames containing
2920 characters which have the eighth bit set, such as many of the
2921 characters from the ISO-8859-1 character set).
2923 Print a summary of the command-line argument format and exit.
2925 Print the version number of @code{updatedb} and exit.
2928 @node Invoking xargs, , Invoking updatedb, Reference
2929 @section Invoking @code{xargs}
2932 xargs @r{[}@var{option}@dots{}@r{]} @r{[}@var{command} @r{[}@var{initial-arguments}@r{]}@r{]}
2935 @code{xargs} exits with the following status:
2941 if any invocation of the command exited with status 1-125
2943 if the command exited with status 255
2945 if the command is killed by a signal
2947 if the command cannot be run
2949 if the command is not found
2951 if some other error occurred.
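The exit status can be inspected from the shell in the usual way. The following is a minimal sketch, assuming GNU @code{xargs} and a @code{false} utility on the search path. It runs a command that always fails and records the resulting status, which GNU @code{xargs} reports as 123 when any invocation exits with a status in the range 1-125.

```shell
# Run `false` once per input item; each invocation exits with status 1.
status=0
printf 'a\nb\n' | xargs -n 1 false || status=$?

# GNU xargs folds per-command failures (1-125) into its own exit status.
echo "xargs exit status: $status"
```

Other implementations of @code{xargs} may report a different non-zero value here, so portable scripts should test only for zero versus non-zero.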
2955 @item --arg-file@r{=@var{inputfile}}
2956 @itemx -a @var{inputfile}
2957 Read names from the file @var{inputfile} instead of standard input.
2961 Input filenames are terminated by a null character instead of by
2962 whitespace, and the quotes and backslash are not special (every
2963 character is taken literally). Disables the end of file string, which
2964 is treated like any other argument.
2966 @item --eof@r{[}=@var{eof-str}@r{]}
2967 @itemx -e@r{[}@var{eof-str}@r{]}
2968 Set the end of file string to @var{eof-str}. If the end of file string
2969 occurs as a line of input, the rest of the input is ignored. If
2970 @var{eof-str} is omitted, there is no end of file string. If this
2971 option is not given, the end of file string defaults to @samp{_}.
2974 Print a summary of the options to @code{xargs} and exit.
2976 @item --replace@r{[}=@var{replace-str}@r{]}
2977 @itemx -i@r{[}@var{replace-str}@r{]}
2978 Replace occurrences of @var{replace-str} in the initial arguments with
2979 names read from standard input. Also, unquoted blanks do not
2980 terminate arguments; instead, the input is split at newlines only.
2981 If @var{replace-str} is omitted, it defaults to @samp{@{@}}
2982 (as for @samp{find -exec}). Implies @samp{-x} and @samp{-l 1}.
2984 @item --max-lines@r{[}=@var{max-lines}@r{]}
2985 @itemx -l@r{[}@var{max-lines}@r{]}
2986 Use at most @var{max-lines} nonblank input lines per command line;
2987 @var{max-lines} defaults to 1 if omitted. Trailing blanks cause an
2988 input line to be logically continued on the next input line, for the
2989 purpose of counting the lines. Implies @samp{-x}.
2991 @item --max-args=@var{max-args}
2992 @itemx -n @var{max-args}
2993 Use at most @var{max-args} arguments per command line. Fewer than
2994 @var{max-args} arguments will be used if the size (see the @samp{-s}
2995 option) is exceeded, unless the @samp{-x} option is given, in which case
2996 @code{xargs} will exit.
3000 Prompt the user about whether to run each command line and read a line
3001 from the terminal. Only run the command line if the response starts
3002 with @samp{y} or @samp{Y}. Implies @samp{-t}.
3004 @item --no-run-if-empty
3006 If the standard input does not contain any nonblanks, do not run the
3007 command. By default, the command is run once even if there is no input.
3009 @item --max-chars=@var{max-chars}
3010 @itemx -s @var{max-chars}
3011 Use at most @var{max-chars} characters per command line, including the
3012 command and initial arguments and the terminating nulls at the ends of
3013 the argument strings.
3017 Print the command line on the standard error output before executing it.
3021 Print the version number of @code{xargs} and exit.
3025 Exit if the size (see the @samp{-s} option) is exceeded.
3028 @item --max-procs=@var{max-procs}
3029 @itemx -P @var{max-procs}
3030 Run up to @var{max-procs} processes at a time; the default is 1. If
3031 @var{max-procs} is 0, @code{xargs} will run as many processes as possible at a time.
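As an illustration of how @samp{-n} groups arguments, the following sketch feeds five words to @code{xargs} and lets it run @code{echo} with at most two of them per command line. The input words are arbitrary placeholders.

```shell
# Five input items, at most two arguments per echo invocation.
out=$(printf '1 2 3 4 5\n' | xargs -n 2 echo)
printf '%s\n' "$out"

# Count the command lines that were run (one output line each).
lines=$(printf '%s\n' "$out" | wc -l)
```

Each output line corresponds to one invocation of @code{echo}; the last invocation receives the single leftover item.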
3035 @node Security Considerations, Error Messages, Reference, Top
3036 @chapter Security Considerations
3038 Security considerations are important if you are using @code{find} or
3039 @code{xargs} to search for or process files that don't belong to you
3040 or over which other people have control. Security considerations
3041 relating to @code{locate} may also apply if you have files which you
3042 may not want others to see.
3044 In general, the most severe forms of security problems affecting
3045 @code{find} and related programs are where third parties can bring
3046 about a situation where those programs allow them to do something
3047 they would normally not be able to do. This is called @emph{privilege
3048 elevation}. This might include deleting files they would not normally
3049 be able to delete. It is also common for the system to periodically
3050 invoke @code{find} for housekeeping purposes. These invocations of
3051 @code{find} are particularly problematic from a security point of view
3052 as these are often invoked by the superuser and search the whole file
3053 system hierarchy. The severity of any associated problem depends on
3054 what the system is going to do with the output of @code{find}.
3057 * Levels of Risk:: What is your level of exposure to security problems?
3058 * Security Considerations for find:: Security problems with find
3059 * Security Considerations for xargs:: Security problems with xargs
3060 * Security Considerations for locate:: Security problems with locate
3061 * Security Summary:: That was all very complex, what does it boil down to?
3065 @node Levels of Risk
3066 @section Levels of Risk
3068 There are some security risks inherent in the use of @code{find},
3069 @code{xargs} and (to a lesser extent) @code{locate}. The severity of
3070 these risks depends on what sort of system you are using:
3074 Multi-user systems where you do not control (or trust) the other
3075 users, and on which you execute @code{find}, including areas where
3076 those other users can manipulate the filesystem (for example beneath
3077 @file{/home} or @file{/tmp}).
3080 Systems where the actions of other users can create filenames chosen
3081 by them, but to which they don't have access while @code{find} is
3082 being run. This access might include leaving programs running (shell
3083 background jobs, @code{at} or @code{cron} tasks, for example). On
3084 these sorts of systems, carefully written commands (avoiding use of
3085 @samp{-print} for example) should not expose you to a high degree of
3086 risk. Most systems fall into this category.
3089 Systems to which untrusted parties do not have access, cannot create
3090 filenames of their own choice (even remotely) and which contain no
3091 security flaws which might enable an untrusted third party to gain
3092 access. Most systems do not fall into this category because there are
3093 many ways in which external parties can affect the names of files that
3094 are created on your system. The system on which I am writing this, for
3095 example, automatically downloads software updates from the Internet;
3096 the names of the files in which these updates exist are chosen by
3097 third parties@footnote{Of course, I trust these parties to a large
3098 extent anyway, because I install software provided by them; I choose
3099 to trust them in this way, and that's a deliberate choice}.
3102 In the discussion above, ``risk'' denotes the likelihood that someone
3103 can cause @code{find}, @code{xargs}, @code{locate} or some other
3104 program which is controlled by them to do something you did not
3105 intend. The levels of risk suggested do not take any account of the
3106 consequences of this sort of event. That is, if you operate a ``low
3107 risk'' type system, but the consequences of a security problem are
3108 disastrous, then you should still give serious thought to all the
3109 possible security problems, many of which of course will not be
3110 discussed here -- this section of the manual is intended to be
3111 informative but not comprehensive or exhaustive.
3113 If you are responsible for the operation of a system where the
3114 consequences of a security problem could be very important, you should do two things:
3118 @item Define a security policy which defines who is allowed to do what
3120 @item Seek competent advice on how to enforce your policy, detect
3121 breaches of that policy, and take account of any potential problems
3122 that might fall outside the scope of your policy
3126 @node Security Considerations for find
3127 @section Security Considerations for find
3130 Some of the actions @code{find} might take have a direct effect;
3131 these include @code{-exec} and @code{-delete}. However, it is also
3132 common to use @code{-print} explicitly or implicitly, and so if
3133 @code{find} produces the wrong list of filenames, that can also be a
3134 security problem; consider the case for example where @code{find} is
3135 producing a list of files to be deleted.
3137 We normally assume that the @code{find} command line expresses the
3138 file selection criteria and actions that the user had in mind -- that
3139 is, the command line is ``trusted'' data.
3141 From a security analysis point of view, the output of @code{find}
3142 should be correct; that is, the output should contain only the names
3143 of those files which meet the user's criteria specified on the command
3144 line. This applies for the @code{-exec} and @code{-delete} actions;
3145 one can consider these to be part of the output.
3147 On the other hand, the contents of the filesystem can be manipulated
3148 by other people, and hence we regard this as ``untrusted'' data. This
3149 implies that the @code{find} command line is a filter which converts
3150 the untrusted contents of the filesystem into a correct list of output files.
3153 The filesystem will in general change while @code{find} is searching
3154 it; in fact, most of the potential security problems with @code{find}
3155 relate to this issue in some way.
3157 Race conditions are a general class of security problem where the
3158 relative ordering of actions taken by @code{find} (for example) and
3159 something else are important@footnote{This is more or less the
3160 definition of the term ``race condition''}.
3162 Typically, an attacker might move or rename files or directories in
3163 the hope that an action might be taken against a file which was not
3164 normally intended to be affected. Alternatively, this sort of attack
3165 might be intended to persuade @code{find} to search part of the
3166 filesystem which would not normally be included in the search
3167 (defeating the @code{-prune} action for example).
3170 * Changing the Current Working Directory::
3171 * Race Conditions with -exec::
3172 * Race Conditions with -print and -print0::
3176 @node Changing the Current Working Directory
3177 @subsection Changing the Current Working Directory
3179 As @code{find} searches the file system, it finds subdirectories and then
3180 searches within them by changing its working directory. First,
3181 @code{find} notices a subdirectory. It then decides if that
3182 subdirectory meets the criteria for being searched; that is, any
3183 @samp{-xdev} or @samp{-prune} expressions are taken into account. The
3184 @code{find} program will then change working directory and proceed to
3185 search the directory.
3187 A race condition attack might take the form that once the checks
3188 relevant to @samp{-xdev} and @samp{-prune} have been done, an attacker
3189 might rename the directory that was being considered, and put in its
3190 place a symbolic link that actually points somewhere else.
3192 The idea behind this attack is to fool @code{find} into going into the
3193 wrong directory. This would leave @code{find} with a working
3194 directory chosen by an attacker, bypassing any protection apparently
3195 provided by @samp{-xdev} and @samp{-prune}, and any protection
3196 provided by being able to @emph{not} list particular directories on
3197 the @code{find} command line. This form of attack is particularly
3198 problematic if the attacker can predict when the @code{find} command
3199 will be run, as is the case with @code{cron} tasks for example.
3201 GNU @code{find} has specific safeguards to prevent this general class
3202 of problem. The exact form of these safeguards depends on the
3203 properties of your system.
3206 * O_NOFOLLOW:: Safely changing directory using fchdir().
3207 * Systems without O_NOFOLLOW:: Checking for symbolic links after chdir().
3208 * Working with automounters:: These can look like race condition exploits
3209 * Problems with dead NFS servers:: If you don't have O_NOFOLLOW, this is a problem.
3213 @subsubsection O_NOFOLLOW
3215 If your system supports the O_NOFOLLOW flag@footnote{GNU/Linux
3216 (kernel version 2.1.126 and later) and FreeBSD (3.0-CURRENT and later)
3217 support this} to the @code{open(2)} system call, @code{find} uses it
3218 when safely changing directory. The target directory is first opened
3219 and then @code{find} changes working directory with the
3220 @code{fchdir()} system call. This ensures that symbolic links are not
3221 followed, preventing the sort of race condition attack in which use
3222 is made of symbolic links.
3224 If for any reason this approach does not work, @code{find} will fall
3225 back on the method which is normally used if O_NOFOLLOW is not supported.
3228 You can tell if your system supports O_NOFOLLOW by running
3234 This will tell you the version number and which features are enabled.
3235 For example, if I run this on my system now, this gives:
3237 GNU find version 4.2.18-CVS
3238 Features enabled: D_TYPE O_NOFOLLOW(enabled)
3241 Here, you can see that I am running a version of find which was built
3242 from the development (CVS) code prior to the release of
3243 findutils-4.2.18, and that the D_TYPE and O_NOFOLLOW features are
3244 present. O_NOFOLLOW is qualified with ``enabled''. This simply means
3245 that the current system seems to support O_NOFOLLOW. This check is
3246 needed because it is possible to build find on a system that defines
3247 O_NOFOLLOW and then run it on a system that ignores the O_NOFOLLOW
3248 flag. We try to detect such cases at startup by checking the
3249 operating system and version number; when this happens you will see
3250 ``O_NOFOLLOW(disabled)'' instead.
3252 @node Systems without O_NOFOLLOW
3253 @subsubsection Systems without O_NOFOLLOW
3255 The strategy for preventing this type of problem on systems that lack
3256 support for the O_NOFOLLOW flag is more complex. Each time
3257 @code{find} changes directory, it examines the directory it is about
3258 to move to, issues the @code{chdir()} system call, and then checks
3259 that it has ended up in the subdirectory it expected. If not, an
3260 error message is issued and @code{find} exits immediately. This
3261 method prevents filesystem manipulation attacks from persuading
3262 @code{find} to search parts of the filesystem it did not intend.
3263 However, we have to take special steps in order not to unnecessarily
3264 conclude that there is a problem with any ``automount'' mount points.
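The examine, change, verify sequence described above can be sketched in shell. This is only an illustration of the idea, not the code @code{find} uses. The function name @code{safe_cd} is hypothetical, and the sketch assumes a @code{stat} command that accepts @samp{-c} and does not follow symbolic links by default (as GNU @code{stat} does).

```shell
# safe_cd DIR: enter DIR, then confirm we arrived at the object we examined.
safe_cd() {
    # Examine first: device and inode of DIR itself (symlinks not followed).
    before=$(stat -c '%d:%i' "$1") || return 1
    cd "$1" || return 1
    # Verify afterwards: device and inode of the directory we ended up in.
    after=$(stat -c '%d:%i' .)
    [ "$before" = "$after" ]
}

# Scratch tree with a real directory and a symbolic link pointing at it.
top=$(mktemp -d)
mkdir "$top/real"
ln -s real "$top/link"

# Subshells keep the caller's working directory unchanged.
(safe_cd "$top/real") && real_ok=yes || real_ok=no
(safe_cd "$top/link") && link_ok=yes || link_ok=no
echo "real directory accepted: $real_ok, symbolic link accepted: $link_ok"
rm -rf "$top"
```

The real directory passes the check, while the symbolic link is rejected because the object examined beforehand (the link) differs from the directory reached by changing into it.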
3266 @node Working with automounters
3267 @subsubsection Working with automounters
3269 Where an automounter is in use it can be the case that the use of the
3270 @code{chdir()} system call can itself cause a new filesystem to be
3271 mounted at that point. On systems that do not support O_NOFOLLOW,
3272 this will cause @code{find}'s security check to fail.
3274 However, this does not normally represent a security problem (since
3275 the automounter configuration is normally set up by the system
3276 administrator). Therefore, if the @code{chdir()} sanity check fails,
3277 @code{find} will check to see if a new filesystem has been mounted at
3278 the current directory; if so, @code{find} will issue a warning message and continue.
3281 To make this solution work, @code{find} reads the list of mounted
3282 filesystems at startup, and again when the sanity check fails. It
3283 compares the two lists to find out if the directory it has moved into
3284 has just been mounted.
3286 @node Problems with dead NFS servers
3287 @subsubsection Problems with dead NFS servers
3289 Examining every mount point on the system has a downside too. In
3290 general, @code{find} will be used to search just part of the
3291 filesystem. However, @code{find} examines every mount point. If the
3292 system has a filesystem mounted on an unresponsive NFS server,
3293 @code{find} will hang, waiting for the NFS server to respond. Worse,
3294 it does this even if the affected mount point is not within the
3295 directory tree that find would have searched anyway.
3297 This is very unfortunate. However, this problem only affects systems
3298 that have no support for O_NOFOLLOW. As far as I can tell, it is not
3299 possible on such systems to fix all three problems (the race
3300 condition, the false-alarm at automount mount points, and the hang at
3301 startup if there is a dead NFS server) at once. If you have some
3302 ideas about how @code{find} could do this better, please send email to
3303 the @email{bug-findutils@@gnu.org} mailing list.
3305 @node Race Conditions with -exec
3306 @subsection Race Conditions with -exec
3308 The @samp{-exec} action causes another program to be run. It is
3309 passed the name of the file which is being considered at the time.
3310 The invoked program will then normally perform some action on that
3311 file. Once again, there is a race condition which can be exploited
3312 here. We shall take as a specific example the command
3315 find /tmp -path /tmp/umsp/passwd -exec /bin/rm @{@} \;
3318 In this simple example, we are identifying just one file to be deleted
3319 and invoking @code{/bin/rm} to delete it. A problem exists because
3320 there is a time gap between the point where @code{find} decides that
3321 it needs to process the @samp{-exec} action and the point where the
3322 @code{/bin/rm} command actually issues the @code{unlink()} system
3323 call. Within this time period, an attacker can rename the
3324 @file{/tmp/umsp} directory, replacing it with a symbolic link to
3325 @file{/etc}. There is no way for @code{/bin/rm} to determine that it
3326 is working on the same file that @code{find} had in mind. Once the
3327 symbolic link is in place, the attacker has persuaded @code{find} to
3328 cause the deletion of the @file{/etc/passwd} file, which is not the
3329 effect intended by the command which was actually invoked.
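The effect of this attack can be reproduced harmlessly inside a scratch directory. The sketch below does not involve a real race: it simply performs the attacker's rename between the moment a path is chosen and the moment @code{rm} is run, using hypothetical stand-in directories @file{umsp} and @file{etc} beneath a temporary directory.

```shell
# Scratch tree standing in for /tmp/umsp and /etc (both are placeholders).
top=$(mktemp -d)
mkdir "$top/umsp" "$top/etc"
touch "$top/umsp/passwd" "$top/etc/passwd"

# Step 1: the path that find has decided to act on.
victim="$top/umsp/passwd"

# Step 2: in the time gap, the attacker swaps the directory for a symlink.
mv "$top/umsp" "$top/umsp.orig"
ln -s etc "$top/umsp"

# Step 3: the command finally runs and follows the link.
rm "$victim"

# Record which file survived before cleaning up.
[ -e "$top/umsp.orig/passwd" ] && intended=present || intended=gone
[ -e "$top/etc/passwd" ] && other=present || other=gone
echo "intended target: $intended, file behind the link: $other"
rm -rf "$top"
```

The file originally selected is left untouched, while the file behind the symbolic link is the one that is removed.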
3331 One possible defence against this type of attack is to modify the
3332 behaviour of @samp{-exec} so that the @code{/bin/rm} command is run
3333 with the argument @file{./passwd} and a suitable choice of working
3334 directory. This would allow the normal sanity check that @code{find}
3335 performs to protect against this form of attack too. Unfortunately,
3336 this strategy cannot be used as the POSIX standard specifies that the
3337 current working directory for commands invoked via @samp{-exec} must
3338 be the same as the current working directory from which @code{find}
3339 was invoked. This means that the @samp{-exec} action is inherently
3340 insecure and can't be fixed.
3342 GNU @code{find} implements a more secure variant of the @samp{-exec}
3343 action, @samp{-execdir}. The @samp{-execdir} action
3344 ensures that it is not necessary to dereference subdirectories to
3345 process target files. The current directory used to invoke programs
3346 is the same as the directory in which the file to be processed exists
3347 (@file{/tmp/umsp} in our example), and only the basename of the file to
3348 be processed is passed to the invoked command, with a @samp{./}
3349 prepended (giving @file{./passwd} in our example).
3351 The @samp{-execdir} action refuses to do anything if the current
3352 directory is included in the @env{PATH} environment variable. This
3353 is necessary because @samp{-execdir} runs programs in the same
3354 directory in which it finds files -- in general, such a directory
3355 might be writable by untrusted users. For similar reasons,
3356 @samp{-execdir} does not allow @samp{@{@}} to appear in the name of
3357 the command to be run.
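To see what a command started by @samp{-execdir} actually receives, the following sketch recreates the example in a scratch directory and prints the working directory and the argument instead of deleting anything. It assumes GNU @code{find} and a @env{PATH} that contains only absolute directories; the directory layout is a placeholder.

```shell
# Placeholder tree: $top/umsp/passwd stands in for /tmp/umsp/passwd.
top=$(mktemp -d)
mkdir "$top/umsp"
touch "$top/umsp/passwd"

# Report the directory the command runs in and the argument it is given.
seen=$(find "$top" -path "$top/umsp/passwd" \
    -execdir sh -c 'printf "%s %s\n" "$(basename "$(pwd)")" "$1"' sh '{}' \;)
echo "$seen"
rm -rf "$top"
```

With GNU @code{find} the command runs inside the @file{umsp} directory and is handed @file{./passwd}, so no directory component remains for an attacker to replace.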
3359 @node Race Conditions with -print and -print0
3360 @subsection Race Conditions with -print and -print0
3362 The @samp{-print} and @samp{-print0} actions can be used to produce a
3363 list of files matching some criteria, which can then be used with some
3364 other command, perhaps with @code{xargs}. Unfortunately, this means
3365 that there is an unavoidable time gap between @code{find} deciding
3366 that one or more files meet its criteria and the relevant command
3367 being executed. For this reason, the @samp{-print} and @samp{-print0}
3368 actions are just as insecure as @samp{-exec}.
3370 In fact, since the construction
3373 find .... -print | xargs ....
3376 does not cope correctly with newlines or other ``white space'' in
3377 filenames, and copes poorly with filenames containing quotes, the
3378 @samp{-print} action is less secure even than @samp{-print0}.
3381 @comment node-name, next, previous, up
3382 @comment @node Security Considerations for xargs
3383 @node Security Considerations for xargs
3384 @section Security Considerations for @code{xargs}
3386 The description of the race conditions affecting the @samp{-print}
3387 action of @code{find} shows that @code{xargs} cannot be secure if it
3388 is possible for an attacker to modify a filesystem after @code{find}
3389 has started but before @code{xargs} has completed all its actions.
3391 However, there are other security issues that exist even if it is not
3392 possible for an attacker to have access to the filesystem in real
3393 time. Firstly, if it is possible for an attacker to create files with
3394 names of their own choice on the filesystem, then @code{xargs} is
3395 insecure unless the @samp{-0} option is used. If a file with the name
3396 @file{/home/someuser/foo/bar\n\n/etc/passwd} exists (assume that each
3397 @samp{\n} stands for a newline character), then @code{find ... -print}
3398 can be persuaded to print three separate lines:
3401 /home/someuser/foo/bar
3406 If it finds a blank line in the input, @code{xargs} will ignore it.
3407 Therefore, if some action is to be taken on the basis of this list of
3408 files, the @file{/etc/passwd} file would be included even if this was
3409 not the intent of the person running find. There are circumstances in
3410 which an attacker can use this to their advantage. The same
3411 consideration applies to filenames containing ordinary spaces rather
3412 than newlines, except that of course the list of filenames will no
3413 longer contain an ``extra'' newline.
3415 This problem is an unavoidable consequence of the default behaviour of
3416 the @code{xargs} command, which is specified by the POSIX standard.
3417 The only ways to avoid this problem are either to avoid all use of
3418 @code{xargs} in favour of, for example, @samp{find -exec} or (where
3419 available) @samp{find -execdir}, or to use the @samp{-0} option, which
3420 ensures that @code{xargs} considers filenames to be separated by ASCII
3421 NUL characters rather than whitespace. However, useful though this
3422 option is, the POSIX standard does not make it mandatory.
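The difference made by @samp{-0} can be observed with a file whose name contains a newline. The following sketch, which assumes a @code{find} supporting @samp{-print0} and an @code{xargs} supporting @samp{-0}, counts how many arguments reach the invoked command in each case. The file names are placeholders.

```shell
# Scratch directory containing one file whose name includes a newline.
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with
newline"

# Newline-delimited output: the odd name is split into two arguments.
count_print=$(find "$dir" -type f -print | xargs sh -c 'echo $#' sh)

# NUL-delimited output: each file name arrives as exactly one argument.
count_print0=$(find "$dir" -type f -print0 | xargs -0 sh -c 'echo $#' sh)

echo "-print: $count_print arguments, -print0: $count_print0 arguments"
rm -rf "$dir"
```

With newline-separated input the two files are seen as three arguments; with NUL-separated input each file is passed as exactly one argument.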
3424 @comment node-name, next, previous, up
3425 @node Security Considerations for locate
3426 @section Security Considerations for @code{locate}
3428 It is fairly unusual for the output of @code{locate} to be fed into
3429 another command. However, if this were to be done, this would raise
3430 the same set of security issues as the use of @samp{find ... -print}.
3431 Although the problems relating to whitespace in filenames can be
3432 resolved by using @code{locate}'s @samp{-0} option, this still leaves
3433 the race condition problems associated with @samp{find ... -print0}.
3434 There is no way to avoid these problems in the case of @code{locate}.
3436 @node Security Summary
3439 Where untrusted parties can create files on the system, or affect the
3440 names of files that are created, all uses for @code{find},
3441 @code{locate} and @code{xargs} have known security problems except the following:
3445 @item Informational use only
3446 Uses where the programs are used to prepare lists of filenames upon which no further action will ever be taken.
3449 Use of the @samp{-delete} action to delete files which meet
3453 Use of the @samp{-execdir} action where the @env{PATH}
3454 environment variable contains directories which contain only trusted programs.
3458 @comment node-name, next, previous, up
3459 @node Error Messages, Primary Index, Security Considerations, Top
3460 @chapter Error Messages
3462 This section describes some of the error messages you might get from
3463 @code{find}, @code{xargs}, or @code{locate}, explains them and in some
3464 cases provides advice as to what you should do about them.
3466 This manual is written in English. The GNU findutils software
3467 features translated error messages for many languages. For this
3468 reason where possible we try to make the error messages produced by
3469 the programs self-explanatory. This approach avoids asking people to
3470 figure out which English-language error message the text they actually
3471 saw might correspond to. Error messages which are self-explanatory
3472 will not normally be described or discussed in this document. For
3473 those messages which are discussed in this document, only the
3474 English-language version of the message will be listed.
3477 * Error Messages From find::
3478 * Error Messages From xargs::
3479 * Error Messages From locate::
3480 * Error Messages From updatedb::
3483 @node Error Messages From find, Error Messages From xargs, , Error Messages
3484 @section Error Messages From find
3487 @item invalid predicate `-foo'
3488 This means that the @code{find} command line included something that
3489 started with a dash or other special character. The @code{find}
3490 program tried to interpret this as a test, action or option, but
3491 didn't recognise it. If you intended it to be a test, check what you
3492 specified against the documentation. If, on the other hand, the
3493 string is the name of a file which has been expanded from a wildcard
3494 (for example because you have a @samp{*} on the command line),
3495 consider using @samp{./*} or just @samp{.} instead.
3497 @item unexpected extra predicate
3498 This usually happens if you have an extra bracket on the command line
3499 (for example @samp{find . -print \)}).
3501 @item Warning: filesystem /path/foo has recently been mounted
3502 @itemx Warning: filesystem /path/foo has recently been unmounted
3503 These messages might appear when @code{find} moves into a directory
3504 and finds that the device number and inode are different to what it
3505 expected them to be. If the directory @code{find} has moved into is
3506 on an NFS filesystem, it will not issue this message, because
3507 @code{automount} frequently mounts new filesystems on directories as
3508 you move into them (that is how it knows you want to use the
3509 filesystem). So, if you do see this message, be wary --
3510 @code{automount} may not have been responsible. Consider the
3511 possibility that someone else is manipulating the filesystem while
3512 @code{find} is running. Some people might do this in order to mislead
3513 @code{find} or persuade it to look at one set of files when it thought
3514 it was looking at another set.
3516 @item /path/foo changed during execution of find (old device number 12345, new device number 6789, filesystem type is <whatever>) [ref XXX]
3517 This message is issued when @code{find} changes directory and ends up
3518 somewhere it didn't expect to be. This happens in one of two
3519 circumstances. Firstly, this happens when ``automount'' does its thing
3520 on a system where @code{find} doesn't know how to determine what the
3521 current set of mounted filesystems is.
3523 Secondly, this can happen when the device number of a directory
3524 appears to change during a change of current directory, but
3525 @code{find} is moving up the filesystem hierarchy rather than down it.
3526 In order to prevent @code{find} wandering off into some unexpected
3527 part of the filesystem, we stop it at this point.
3529 @item Don't know how to use getmntent() to read `/etc/mtab'. This is a bug.
3530 This message is issued when a problem similar to the above occurs on a
3531 system where @code{find} doesn't know how to figure out the current
3532 list of mount points. Ask for help on @email{bug-findutils@@gnu.org}.
3534 @item /path/foo/bar changed during execution of find (old inode number 12345, new inode number 67893, filesystem type is <whatever>) [ref XXX]
3535 This message is issued when @code{find} changes directory and
3536 discovers that the inode number of that directory once it's got there
3537 is different to the inode number that it obtained when it examined the
3538 directory some time previously. This normally means that while
3539 @code{find} has been deep in a directory hierarchy doing something
3540 time consuming, somebody has moved one of the parent directories
3541 to another location in the same filesystem. This may have been done
3542 maliciously, or may not. In any case, @code{find} stops at this point
3543 in order to avoid traversing parts of the filesystem that it wasn't
3544 intended to. You can use @code{ls -li} or @code{find /path -inum
3545 12345 -o -inum 67893} to find out more about what has happened.
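
As a rough illustration of the kind of investigation suggested above,
the following sketch moves a directory within one filesystem and then
uses @code{-inum} to see where it ended up. The scratch directory and
the names @file{a}, @file{b} and @file{sub} are made up for this
example; it only assumes that @code{mktemp}, @code{ls -di} and
@code{find -inum} behave as they usually do.

```shell
# Scratch tree with hypothetical names; nothing here comes from a real system.
top=$(mktemp -d)
mkdir -p "$top/a/sub" "$top/b"

# Note the inode number of the directory; ls -di prints "<inode> <name>".
set -- $(ls -di "$top/a/sub")
inode=$1

# Simulate somebody moving the directory elsewhere on the same filesystem.
mv "$top/a/sub" "$top/b/sub"

# The inode number travels with the directory, so -inum shows its new location.
found=$(find "$top" -inum "$inode")
echo "$found"

# Remove the scratch tree again.
rm -rf "$top"
```

On a real system you would search from the mount point of the affected
filesystem and use the inode numbers quoted in the error message.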
3547 @item sanity check of the fnmatch() library function failed.
3548 Please submit a bug report. You may well be asked questions about
3549 your system, and if you compiled the @code{findutils} code yourself,
3550 you should keep your copy of the build tree around. The likely
3551 explanation is that your system has a buggy implementation of
3552 @code{fnmatch} that looks enough like the GNU version to fool
3553 @code{configure}, but which doesn't work properly.
This normally happens if you use the @code{-exec} action or
something similar (@code{-ok} and so forth) but the system has run out
3558 of free process slots. This is either because the system is very busy
3559 and the system has reached its maximum process limit, or because you
3560 have a resource limit in place and you've reached it. Check the
3561 system for runaway processes (if @code{ps} still works). Some process
3562 slots are normally reserved for use by @samp{root}.
3564 @item some-program terminated by signal 99
3565 Some program which was launched via @code{-exec} or similar was killed
3566 with a fatal signal. This is just an advisory message.
3571 @node Error Messages From xargs, Error Messages From locate, Error Messages From find, Error Messages
3572 @section Error Messages From xargs
3575 @item environment is too large for exec
3576 This message means that you have so many environment variables set
3577 (or such large values for them) that there is no room within the
3578 system-imposed limits on program command-line argument length to
3579 invoke any program. I'm sure you did this deliberately. Please try
3580 unsetting some environment variables, or exiting the current shell.
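
One possible way to check how much room the environment takes up, and
to run @code{xargs} with a smaller one, is sketched below. It assumes
that @code{getconf ARG_MAX} reports the relevant limit on your system
and that @code{env -i} is available; the input words are placeholders.

```shell
# System limit on the combined size of arguments and environment, in bytes.
limit=$(getconf ARG_MAX)

# Approximate size of the current environment, in bytes.
used=$(env | wc -c)
echo "environment uses about $used of $limit bytes"

# Run xargs under a minimal environment (only PATH is kept), which
# leaves as much room as possible for the command line it builds.
out=$(printf '%s\n' one two | env -i PATH="$PATH" xargs echo)
echo "$out"
```

Keeping only @samp{PATH} is a choice made for this sketch; a real
command may need other variables passed through as well.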
3582 @item can not fit single argument within argument list size limit
3583 You are using the @samp{-i} option and @code{xargs} doesn't have
3584 enough space to build a command line because it has read in a really
3585 large item and it doesn't fit. You can probably work around this
3586 problem with the @samp{-s} option, but the default size is pretty
3587 large. You must be trying pretty hard to break @code{xargs}.
3590 See the description of the similar message for @code{find}.
3592 @item <program>: exited with status 255; aborting
3593 When a command run by @code{xargs} exits with status 255, @code{xargs}
is supposed to stop. If this is not what you intended, wrap the
program you are trying to invoke in a shell script which doesn't
return status 255.
3598 @item <program>: terminated by signal 99
3599 See the description of the similar message for @code{find}.
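
The wrapper suggested for the @samp{exited with status 255} message
could look something like the following sketch. The inline script
stands in for a real program and is purely hypothetical: it pretends
to fail with status 255 on the item @samp{b}. The wrapper maps that
status to 1, which @code{xargs} treats as an ordinary failure rather
than a reason to stop.

```shell
# xargs_status stays 0 unless xargs itself reports a failure.
xargs_status=0
out=$(printf '%s\n' a b c |
  xargs -n 1 sh -c '
    # Placeholder for the real program: fails with 255 on item b.
    if [ "$1" = b ]; then (exit 255); else echo "processed $1"; fi
    status=$?
    # Map 255 to 1 so that xargs carries on with the remaining items.
    if [ "$status" -eq 255 ]; then exit 1; fi
    exit "$status"
  ' sh) || xargs_status=$?

echo "$out"
echo "xargs exit status: $xargs_status"
```

With the wrapper in place the item after the failing one is still
processed, and @code{xargs} reports the failure through its own
non-zero exit status.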
3602 @node Error Messages From locate, Error Messages From updatedb, Error Messages From xargs, Error Messages
3603 @section Error Messages From locate
3606 @item warning: database `/usr/local/var/locatedb' is more than 8 days old
3607 The @code{locate} program relies on a database which is periodically
3608 built by the @code{updatedb} program. That hasn't happened in a long
time. To fix this problem, run @code{updatedb} manually. This
often happens on systems that are not left running all the time, so the periodic
``cron'' task which normally does this doesn't get a chance to run.
3613 @item locate database `/usr/local/var/locatedb' is corrupt or invalid
3614 This should not happen. Re-run @code{updatedb}. If that works, but
3615 @code{locate} still produces this error, run @code{locate --version}
3616 and @code{updatedb --version}. These should produce the same output.
3617 If not, you are using a mixed toolset; check your @samp{$PATH}
3618 environment variable and your shell aliases (if you have any). If
3619 both programs claim to be GNU versions, this is a bug; all versions of
3620 these programs should interoperate without problem. Ask for help on
3621 @email{bug-findutils@@gnu.org}.
3625 @node Error Messages From updatedb, , Error Messages From locate, Error Messages
3626 @section Error Messages From updatedb
3628 The @code{updatedb} program (and the programs it invokes) do issue
3629 error messages, but none of them seem to me to be candidates for
3630 guidance. If you are having a problem understanding one of these, ask
3631 for help on @email{bug-findutils@@gnu.org}.
3633 @node Primary Index, , Error Messages, Top
3634 @unnumbered @code{find} Primary Index
3636 This is a list of all of the primaries (tests, actions, and options)
3637 that make up @code{find} expressions for selecting files. @xref{find
3638 Expressions}, for more information on expressions.
3644 @comment texi related words used by Emacs' spell checker ispell.el
3646 @comment LocalWords: texinfo setfilename settitle setchapternewpage
3647 @comment LocalWords: iftex finalout ifinfo DIR titlepage vskip pt
3648 @comment LocalWords: filll dir samp dfn noindent xref pxref
3649 @comment LocalWords: var deffn texi deffnx itemx emph asis
3650 @comment LocalWords: findex smallexample subsubsection cindex
3651 @comment LocalWords: dircategory direntry itemize
3653 @comment other words used by Emacs' spell checker ispell.el
3654 @comment LocalWords: README fred updatedb xargs Plett Rendell akefile
3655 @comment LocalWords: args grep Filesystems fo foo fOo wildcards iname
3656 @comment LocalWords: ipath regex iregex expr fubar regexps
3657 @comment LocalWords: metacharacters macs sr sc inode lname ilname
3658 @comment LocalWords: sysdep noleaf ls inum xdev filesystems usr atime
3659 @comment LocalWords: ctime mtime amin cmin mmin al daystart Sladkey rm
3660 @comment LocalWords: anewer cnewer bckw rf xtype uname gname uid gid
3661 @comment LocalWords: nouser nogroup chown chgrp perm ch maxdepth
3662 @comment LocalWords: mindepth cpio src CD AFS statted stat fstype ufs
3663 @comment LocalWords: nfs tmp mfs printf fprint dils rw djm Nov lwall
3664 @comment LocalWords: POSIXLY fls fprintf strftime locale's EDT GMT AP
3665 @comment LocalWords: EST diff perl backquotes sprintf Falstad Oct cron
3666 @comment LocalWords: eg vmunix mkdir afs allexec allwrite ARG bigram
3667 @comment LocalWords: bigrams cd chmod comp crc CVS dbfile dum eof
3668 @comment LocalWords: fileserver filesystem fn frcode Ghazi Hnewc iXX
3669 @comment LocalWords: joeuser Kaveh localpaths localuser LOGNAME
3670 @comment LocalWords: Meyering mv netpaths netuser nonblank nonblanks
3671 @comment LocalWords: ois ok Pinard printindex proc procs prunefs
3672 @comment LocalWords: prunepaths pwd RFS rmadillo rmdir rsh sbins str
3673 @comment LocalWords: su Timar ubins ug unstripped vf VM Weitzel
3674 @comment LocalWords: wildcard zlogout basename execdir wholename iwholename
3675 @comment LocalWords: timestamp timestamps Solaris FreeBSD OpenBSD POSIX