1 <?xml version="1.0" ?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml">
4 <head>
5 <title>File::Find - Traverse a directory tree.</title>
6 <meta http-equiv="content-type" content="text/html; charset=utf-8" />
7 <link rev="made" href="mailto:" />
8 </head>
10 <body style="background-color: white">
11 <table border="0" width="100%" cellspacing="0" cellpadding="3">
12 <tr><td class="block" style="background-color: #cccccc" valign="middle">
13 <big><strong><span class="block">&nbsp;File::Find - Traverse a directory tree.</span></strong></big>
14 </td></tr>
15 </table>
17 <p><a name="__index__"></a></p>
18 <!-- INDEX BEGIN -->
20 <ul>
22 <li><a href="#name">NAME</a></li>
23 <li><a href="#synopsis">SYNOPSIS</a></li>
24 <li><a href="#description">DESCRIPTION</a></li>
25 <ul>
27 <li><a href="#_options">%options</a></li>
28 <li><a href="#the_wanted_function">The wanted function</a></li>
29 </ul>
31 <li><a href="#warnings">WARNINGS</a></li>
32 <li><a href="#caveat">CAVEAT</a></li>
33 <li><a href="#notes">NOTES</a></li>
34 <li><a href="#bugs_and_caveats">BUGS AND CAVEATS</a></li>
35 <li><a href="#history">HISTORY</a></li>
36 </ul>
37 <!-- INDEX END -->
39 <hr />
40 <p>
41 </p>
42 <h1><a name="name">NAME</a></h1>
43 <p>File::Find - Traverse a directory tree.</p>
44 <p>
45 </p>
46 <hr />
47 <h1><a name="synopsis">SYNOPSIS</a></h1>
<pre>
    use File::Find;
    find(\&amp;wanted, @directories_to_search);
    sub wanted { ... }</pre>
<pre>
    use File::Find;
    finddepth(\&amp;wanted, @directories_to_search);
    sub wanted { ... }</pre>
<pre>
    use File::Find;
    find({ wanted =&gt; \&amp;process, follow =&gt; 1 }, '.');</pre>
59 <p>
60 </p>
61 <hr />
62 <h1><a name="description">DESCRIPTION</a></h1>
<p>These functions search through directory trees and do work
on each file found, much like the Unix <em>find</em> command. File::Find
exports two functions, <a href="#item_find"><code>find</code></a> and <a href="#item_finddepth"><code>finddepth</code></a>. They work similarly
but differ in the order in which directories and their contents are visited.</p>
67 <dl>
68 <dt><strong><a name="item_find"><strong>find</strong></a></strong>
70 <dd>
<pre>
    find(\&amp;wanted,  @directories);
    find(\%options, @directories);</pre>
74 </dd>
75 <dd>
76 <p><a href="#item_find"><code>find()</code></a> does a depth-first search over the given <code>@directories</code> in
77 the order they are given. For each file or directory found, it calls
78 the <code>&amp;wanted</code> subroutine. (See below for details on how to use the
79 <code>&amp;wanted</code> function). Additionally, for each directory found, it will
80 <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_chdir"><code>chdir()</code></a> into that directory and continue the search, invoking the
81 <code>&amp;wanted</code> function on each file or subdirectory in the directory.</p>
82 </dd>
83 <dt><strong><a name="item_finddepth"><strong>finddepth</strong></a></strong>
85 <dd>
<pre>
    finddepth(\&amp;wanted,  @directories);
    finddepth(\%options, @directories);</pre>
89 </dd>
90 <dd>
<p><a href="#item_finddepth"><code>finddepth()</code></a> works just like <a href="#item_find"><code>find()</code></a> except that it invokes the
92 <code>&amp;wanted</code> function for a directory <em>after</em> invoking it for the
93 directory's contents. It does a postorder traversal instead of a
94 preorder traversal, working from the bottom of the directory tree up
95 where <a href="#item_find"><code>find()</code></a> works from the top of the tree down.</p>
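<p>A minimal sketch of the ordering difference, assuming a throwaway tree created with <code>File::Temp</code>. The helper <code>visit_order()</code> and the file names are illustrative and not part of File::Find:</p>

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a tiny throwaway tree: <tmp>/a/b/file.txt (names are illustrative).
my $root = tempdir(CLEANUP => 1);
mkdir "$root/a"   or die "mkdir failed: $!";
mkdir "$root/a/b" or die "mkdir failed: $!";
open(my $fh, '>', "$root/a/b/file.txt") or die "cannot create file: $!";
close($fh);

# Hypothetical helper: run find() or finddepth() and return the visited
# paths relative to $root, in the order the callback saw them.
sub visit_order {
    my ($walker) = @_;              # \&find or \&finddepth
    my @seen;
    $walker->(sub {
        my $rel = $File::Find::name;
        $rel =~ s/^\Q$root\E\/?//;  # strip the temporary prefix
        push @seen, $rel eq '' ? '.' : $rel;
    }, $root);
    return @seen;
}

my @pre  = visit_order(\&find);       # each directory before its contents
my @post = visit_order(\&finddepth);  # each directory after its contents

print "find:      @pre\n";
print "finddepth: @post\n";
```

<p>With this single-branch tree, <code>find()</code> should report <code>. a a/b a/b/file.txt</code> and <code>finddepth()</code> the same entries in reverse.</p>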
96 </dd>
97 </dl>
98 <p>
99 </p>
100 <h2><a name="_options">%options</a></h2>
101 <p>The first argument to <a href="#item_find"><code>find()</code></a> is either a code reference to your
102 <code>&amp;wanted</code> function, or a hash reference describing the operations
103 to be performed for each file. The
104 code reference is described in <a href="#the_wanted_function">The wanted function</a> below.</p>
105 <p>Here are the possible keys for the hash:</p>
106 <dl>
107 <dt><strong><a name="item_wanted"><code>wanted</code></a></strong>
109 <dd>
110 <p>The value should be a code reference. This code reference is
111 described in <a href="#the_wanted_function">The wanted function</a> below.</p>
112 </dd>
113 </li>
114 <dt><strong><a name="item_bydepth"><code>bydepth</code></a></strong>
116 <dd>
<p>Reports the name of a directory only AFTER all its entries
have been reported. The entry point <a href="#item_finddepth"><code>finddepth()</code></a> is a shortcut for
specifying <code>{ bydepth =&gt; 1 }</code> in the first argument of <a href="#item_find"><code>find()</code></a>.</p>
120 </dd>
121 </li>
122 <dt><strong><a name="item_preprocess"><code>preprocess</code></a></strong>
124 <dd>
125 <p>The value should be a code reference. This code reference is used to
126 preprocess the current directory. The name of the currently processed
127 directory is in <a href="#item__file__find__dir"><code>$File::Find::dir</code></a>. Your preprocessing function is
128 called after <code>readdir()</code>, but before the loop that calls the <a href="#item_wanted"><code>wanted()</code></a>
129 function. It is called with a list of strings (actually file/directory
130 names) and is expected to return a list of strings. The code can be
131 used to sort the file/directory names alphabetically, numerically,
132 or to filter out directory entries based on their name alone. When
133 <em>follow</em> or <em>follow_fast</em> are in effect, <a href="#item_preprocess"><code>preprocess</code></a> is a no-op.</p>
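<p>As a hedged illustration, a <code>preprocess</code> hook might both filter and sort the entries of each directory. The file names and the <code>.bak</code> filter below are assumptions made for the example:</p>

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway directory with a few files (names are illustrative).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(c.txt a.txt b.bak)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    close($fh);
}

my @visited;
find({
    # Receives the raw entry names of $File::Find::dir and returns the
    # names that should be visited, in the order they should be visited.
    preprocess => sub {
        my @keep = grep { !/\.bak\z/ } @_;   # drop backup files by name
        return sort @keep;                   # visit the rest alphabetically
    },
    # Record plain files only; directories are skipped by the -f test.
    wanted => sub { push @visited, $_ if -f $_ },
}, $dir);

print "@visited\n";
```

<p>Because the filter works on names alone, it never has to stat the entries it discards.</p>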
134 </dd>
135 </li>
136 <dt><strong><a name="item_postprocess"><code>postprocess</code></a></strong>
138 <dd>
139 <p>The value should be a code reference. It is invoked just before leaving
140 the currently processed directory. It is called in void context with no
141 arguments. The name of the current directory is in <a href="#item__file__find__dir"><code>$File::Find::dir</code></a>. This
142 hook is handy for summarizing a directory, such as calculating its disk
143 usage. When <em>follow</em> or <em>follow_fast</em> are in effect, <a href="#item_postprocess"><code>postprocess</code></a> is a
144 no-op.</p>
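<p>A sketch of the disk-usage idea, assuming a small temporary tree. The hash <code>%bytes_in</code> and the list <code>@finished</code> are names chosen for this example:</p>

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Temporary tree with one file at the top and one in a subdirectory.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/sub" or die "mkdir failed: $!";
my %content = ("$root/top.txt" => 'abc', "$root/sub/inner.txt" => 'hello');
for my $path (keys %content) {
    open(my $fh, '>', $path) or die "cannot create $path: $!";
    print {$fh} $content{$path};
    close($fh);
}

my (%bytes_in, @finished);
find({
    wanted => sub {
        # Add each plain file's size to the directory that contains it.
        # The "_" handle reuses the stat data gathered by "-f $_".
        $bytes_in{$File::Find::dir} += (-s _) if -f $_;
    },
    # Called with no arguments just before find() leaves a directory.
    postprocess => sub { push @finished, $File::Find::dir },
}, $root);

for my $d (@finished) {
    printf "%s: %d bytes\n", $d, $bytes_in{$d} || 0;
}
```

<p>The totals here count only the files directly inside each directory. Rolling sizes up into parent directories would need extra bookkeeping.</p>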
145 </dd>
146 </li>
147 <dt><strong><a name="item_follow"><code>follow</code></a></strong>
149 <dd>
150 <p>Causes symbolic links to be followed. Since directory trees with symbolic
151 links (followed) may contain files more than once and may even have
152 cycles, a hash has to be built up with an entry for each file.
153 This might be expensive both in space and time for a large
154 directory tree. See <em>follow_fast</em> and <em>follow_skip</em> below.
155 If either <em>follow</em> or <em>follow_fast</em> is in effect:</p>
<ul>
<li>
<p>It is guaranteed that an <em>lstat</em> has been called before the user's
<a href="#item_wanted"><code>wanted()</code></a> function is called. This enables fast file checks involving <code>_</code>.
Note that this guarantee no longer holds if neither <em>follow</em> nor <em>follow_fast</em>
is set.</p>
</li>
<li>
<p>There is a variable <code>$File::Find::fullname</code> which holds the absolute
pathname of the file with all symbolic links resolved. If the link is
a dangling symbolic link, then <code>$File::Find::fullname</code> will be set to <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_undef"><code>undef</code></a>.</p>
</li>
</ul>
<p>This is a no-op on Win32.</p>
</dd>
171 <dt><strong><a name="item_follow_fast"><code>follow_fast</code></a></strong>
173 <dd>
174 <p>This is similar to <em>follow</em> except that it may report some files more
175 than once. It does detect cycles, however. Since only symbolic links
176 have to be hashed, this is much cheaper both in space and time. If
177 processing a file more than once (by the user's <a href="#item_wanted"><code>wanted()</code></a> function)
178 is worse than just taking time, the option <em>follow</em> should be used.</p>
179 </dd>
180 <dd>
181 <p>This is also a no-op on Win32.</p>
182 </dd>
183 </li>
184 <dt><strong><a name="item_follow_skip"><code>follow_skip</code></a></strong>
186 <dd>
187 <p><code>follow_skip==1</code>, which is the default, causes all files which are
188 neither directories nor symbolic links to be ignored if they are about
to be processed a second time. If a directory or a symbolic link
is about to be processed a second time, File::Find dies.</p>
191 </dd>
192 <dd>
193 <p><code>follow_skip==0</code> causes File::Find to die if any file is about to be
194 processed a second time.</p>
195 </dd>
196 <dd>
197 <p><code>follow_skip==2</code> causes File::Find to ignore any duplicate files and
198 directories but to proceed normally otherwise.</p>
199 </dd>
200 </li>
201 <dt><strong><a name="item_dangling_symlinks"><code>dangling_symlinks</code></a></strong>
203 <dd>
<p>If the value is true and is a code reference, it will be called with the
symbolic link name and the directory it lives in as arguments. Otherwise, if
the value is true and warnings are on, the warning ``symbolic_link_name is a dangling
symbolic link\n'' will be issued. If the value is false, the dangling symbolic link
will be silently ignored.</p>
209 </dd>
210 </li>
211 <dt><strong><a name="item_no_chdir"><code>no_chdir</code></a></strong>
213 <dd>
214 <p>Does not <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_chdir"><code>chdir()</code></a> to each directory as it recurses. The <a href="#item_wanted"><code>wanted()</code></a>
215 function will need to be aware of this, of course. In this case,
216 <a href="#item___"><code>$_</code></a> will be the same as <a href="#item__file__find__name"><code>$File::Find::name</code></a>.</p>
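<p>A small sketch of what <code>no_chdir</code> changes for the callback, assuming a temporary directory. The flag variables <code>$same_name</code> and <code>$stayed_put</code> exist only for this example:</p>

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Find;
use File::Temp qw(tempdir);

# Temporary directory holding a single file.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/note.txt") or die "cannot create file: $!";
close($fh);

my $start = getcwd();
my ($same_name, $stayed_put) = (1, 1);
find({
    no_chdir => 1,
    wanted   => sub {
        # With no_chdir, $_ carries the full path rather than the bare name.
        $same_name  = 0 unless $_ eq $File::Find::name;
        # The process should still be in the directory it started in.
        $stayed_put = 0 unless getcwd() eq $start;
    },
}, $dir);

print "same name: $same_name, stayed put: $stayed_put\n";
```

<p>File tests inside the callback keep working in this mode because <code>$_</code> is a path that is valid from the starting directory.</p>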
217 </dd>
218 </li>
219 <dt><strong><a name="item_untaint"><code>untaint</code></a></strong>
221 <dd>
222 <p>If find is used in taint-mode (-T command line switch or if EUID != UID
223 or if EGID != GID) then internally directory names have to be untainted
224 before they can be chdir'ed to. Therefore they are checked against a regular
225 expression <em>untaint_pattern</em>. Note that all names passed to the user's
226 <em>wanted()</em> function are still tainted. If this option is used while
227 not in taint-mode, <a href="#item_untaint"><code>untaint</code></a> is a no-op.</p>
228 </dd>
229 </li>
230 <dt><strong><a name="item_untaint_pattern"><code>untaint_pattern</code></a></strong>
232 <dd>
233 <p>See above. This should be set using the <code>qr</code> quoting operator.
234 The default is set to <code>qr|^([-+@\w./]+)$|</code>.
235 Note that the parentheses are vital.</p>
236 </dd>
237 </li>
238 <dt><strong><a name="item_untaint_skip"><code>untaint_skip</code></a></strong>
240 <dd>
241 <p>If set, a directory which fails the <em>untaint_pattern</em> is skipped,
242 including all its sub-directories. The default is to 'die' in such a case.</p>
243 </dd>
244 </li>
245 </dl>
<p>
</p>
248 <h2><a name="the_wanted_function">The wanted function</a></h2>
249 <p>The <a href="#item_wanted"><code>wanted()</code></a> function does whatever verifications you want on
250 each file and directory. Note that despite its name, the <a href="#item_wanted"><code>wanted()</code></a>
251 function is a generic callback function, and does <strong>not</strong> tell
252 File::Find if a file is ``wanted'' or not. In fact, its return value
253 is ignored.</p>
254 <p>The wanted function takes no arguments but rather does its work
255 through a collection of variables.</p>
256 <dl>
257 <dt><strong><a name="item__file__find__dir"><code>$File::Find::dir</code> is the current directory name,</a></strong>
259 <dt><strong><a name="item___"><code>$_</code> is the current filename within that directory</a></strong>
261 <dt><strong><a name="item__file__find__name"><code>$File::Find::name</code> is the complete pathname to the file.</a></strong>
263 </dl>
264 <p>Don't modify these variables.</p>
265 <p>For example, when examining the file <em>/some/path/foo.ext</em> you will have:</p>
<pre>
    $File::Find::dir = /some/path/
    $_ = foo.ext
    $File::Find::name = /some/path/foo.ext</pre>
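<p>Since the callback receives no arguments and its return value is ignored, results are usually collected through a closure. The following is a sketch under that assumption. <code>find_by_suffix()</code> is a hypothetical helper, not part of File::Find:</p>

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: return the full paths of plain files whose
# names end in $suffix, searching the given directories.
sub find_by_suffix {
    my ($suffix, @dirs) = @_;
    my @matches;
    find(sub {
        # $_ is the bare file name and the current directory is
        # $File::Find::dir, so "-f $_" tests the right file.
        return unless -f $_ && /\Q$suffix\E\z/;
        push @matches, $File::Find::name;   # full path for the caller
    }, @dirs);
    my @sorted = sort @matches;
    return @sorted;
}

# Usage on a throwaway directory (file names are illustrative).
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.txt b.log c.txt)) {
    open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
    close($fh);
}
my @txt = find_by_suffix('.txt', $dir);
print "$_\n" for @txt;
```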
270 <p>You are chdir()'d to <a href="#item__file__find__dir"><code>$File::Find::dir</code></a> when the function is called,
271 unless <a href="#item_no_chdir"><code>no_chdir</code></a> was specified. Note that when changing to
272 directories is in effect the root directory (<em>/</em>) is a somewhat
273 special case inasmuch as the concatenation of <a href="#item__file__find__dir"><code>$File::Find::dir</code></a>,
274 <code>'/'</code> and <a href="#item___"><code>$_</code></a> is not literally equal to <a href="#item__file__find__name"><code>$File::Find::name</code></a>. The
275 table below summarizes all variants:</p>
<pre>
                $File::Find::name  $File::Find::dir  $_
 default        /                  /                 .
 no_chdir=&gt;0    /etc               /                 etc
                /etc/x             /etc              x

 no_chdir=&gt;1    /                  /                 /
                /etc               /                 /etc
                /etc/x             /etc              /etc/x</pre>
<p>When <em>follow</em> or <em>follow_fast</em> is in effect, there is
286 also a <code>$File::Find::fullname</code>. The function may set
287 <code>$File::Find::prune</code> to prune the tree unless <a href="#item_bydepth"><code>bydepth</code></a> was
288 specified. Unless <a href="#item_follow"><code>follow</code></a> or <a href="#item_follow_fast"><code>follow_fast</code></a> is specified, for
289 compatibility reasons (find.pl, find2perl) there are in addition the
290 following globals available: <code>$File::Find::topdir</code>,
291 <code>$File::Find::topdev</code>, <code>$File::Find::topino</code>,
292 <code>$File::Find::topmode</code> and <code>$File::Find::topnlink</code>.</p>
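<p>A hedged example of pruning: the callback below sets <code>$File::Find::prune</code> to skip a directory named <code>.git</code>. The directory layout and file names are assumptions made for the sketch:</p>

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Temporary tree: src/main.pl should be found, .git/config should not.
my $root = tempdir(CLEANUP => 1);
for my $sub (qw(src .git)) {
    mkdir "$root/$sub" or die "mkdir $sub failed: $!";
}
for my $path ("$root/src/main.pl", "$root/.git/config") {
    open(my $fh, '>', $path) or die "cannot create $path: $!";
    close($fh);
}

my @files;
find(sub {
    if ($_ eq '.git' && -d $_) {
        $File::Find::prune = 1;   # do not descend into this directory
        return;
    }
    push @files, $_ if -f $_;     # collect bare names of plain files
}, $root);

print "@files\n";
```

<p>This only works with <code>find()</code>. Under <code>finddepth()</code> or <code>bydepth</code> the directory's contents have already been visited by the time the callback sees the directory itself.</p>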
293 <p>This library is useful for the <code>find2perl</code> tool, which when fed,</p>
<pre>
    find2perl / -name .nfs\* -mtime +7 \
        -exec rm -f {} \; -o -fstype nfs -prune</pre>
297 <p>produces something like:</p>
<pre>
    sub wanted {
        /^\.nfs.*\z/s &amp;&amp;
        (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_)) &amp;&amp;
        int(-M _) &gt; 7 &amp;&amp;
        unlink($_)
        ||
        ($nlink || (($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))) &amp;&amp;
        $dev &lt; 0 &amp;&amp;
        ($File::Find::prune = 1);
    }</pre>
309 <p>Notice the <code>_</code> in the above <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_int"><code>int(-M _)</code></a>: the <code>_</code> is a magical
310 filehandle that caches the information from the preceding
311 <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_stat"><code>stat()</code></a>, <a href="file://C|\msysgit\mingw\html/pod/perlfunc.html#item_lstat"><code>lstat()</code></a>, or filetest.</p>
312 <p>Here's another interesting wanted function. It will find all symbolic
313 links that don't resolve:</p>
<pre>
    sub wanted {
        -l &amp;&amp; !-e &amp;&amp; print &quot;bogus link: $File::Find::name\n&quot;;
    }</pre>
318 <p>See also the script <code>pfind</code> on CPAN for a nice application of this
319 module.</p>
<p>
</p>
322 <hr />
323 <h1><a name="warnings">WARNINGS</a></h1>
324 <p>If you run your program with the <code>-w</code> switch, or if you use the
325 <code>warnings</code> pragma, File::Find will report warnings for several weird
326 situations. You can disable these warnings by putting the statement</p>
<pre>
    no warnings 'File::Find';</pre>
329 <p>in the appropriate scope. See <a href="file://C|\msysgit\mingw\html/pod/perllexwarn.html">the perllexwarn manpage</a> for more info about lexical
330 warnings.</p>
<p>
</p>
333 <hr />
334 <h1><a name="caveat">CAVEAT</a></h1>
335 <dl>
336 <dt><strong><a name="item__dont_use_nlink">$dont_use_nlink</a></strong>
338 <dd>
339 <p>You can set the variable <code>$File::Find::dont_use_nlink</code> to 1, if you want to
340 force File::Find to always stat directories. This was used for file systems
341 that do not have an <code>nlink</code> count matching the number of sub-directories.
342 Examples are ISO-9660 (CD-ROM), AFS, HPFS (OS/2 file system), FAT (DOS file
343 system) and a couple of others.</p>
344 </dd>
345 <dd>
346 <p>You shouldn't need to set this variable, since File::Find should now detect
347 such file systems on-the-fly and switch itself to using stat. This works even
348 for parts of your file system, like a mounted CD-ROM.</p>
349 </dd>
350 <dd>
351 <p>If you do set <code>$File::Find::dont_use_nlink</code> to 1, you will notice slow-downs.</p>
352 </dd>
353 </li>
354 <dt><strong><a name="item_symlinks">symlinks</a></strong>
356 <dd>
357 <p>Be aware that the option to follow symbolic links can be dangerous.
358 Depending on the structure of the directory tree (including symbolic
359 links to directories) you might traverse a given (physical) directory
360 more than once (only if <a href="#item_follow_fast"><code>follow_fast</code></a> is in effect).
361 Furthermore, deleting or changing files in a symbolically linked directory
362 might cause very unpleasant surprises, since you delete or change files
363 in an unknown directory.</p>
364 </dd>
365 </li>
366 </dl>
<p>
</p>
369 <hr />
370 <h1><a name="notes">NOTES</a></h1>
371 <ul>
372 <li>
373 <p>Mac OS (Classic) users should note a few differences:</p>
374 <ul>
375 <li>
376 <p>The path separator is ':', not '/', and the current directory is denoted
377 as ':', not '.'. You should be careful about specifying relative pathnames.
378 While a full path always begins with a volume name, a relative pathname
379 should always begin with a ':'. If specifying a volume name only, a
380 trailing ':' is required.</p>
381 </li>
382 <li>
383 <p><a href="#item__file__find__dir"><code>$File::Find::dir</code></a> is guaranteed to end with a ':'. If <a href="#item___"><code>$_</code></a>
384 contains the name of a directory, that name may or may not end with a
385 ':'. Likewise, <a href="#item__file__find__name"><code>$File::Find::name</code></a>, which contains the complete
386 pathname to that directory, and <code>$File::Find::fullname</code>, which holds
387 the absolute pathname of that directory with all symbolic links resolved,
388 may or may not end with a ':'.</p>
389 </li>
390 <li>
391 <p>The default <a href="#item_untaint_pattern"><code>untaint_pattern</code></a> (see above) on Mac OS is set to
392 <code>qr|^(.+)$|</code>. Note that the parentheses are vital.</p>
393 </li>
394 <li>
395 <p>The invisible system file ``Icon\015'' is ignored. While this file may
396 appear in every directory, there are some more invisible system files
397 on every volume, which are all located at the volume root level (i.e.
398 ``MacintoshHD:''). These system files are <strong>not</strong> excluded automatically.
399 Your filter may use the following code to recognize invisible files or
400 directories (requires Mac::Files):</p>
<pre>
    use Mac::Files;

    # invisible() -- returns 1 if file/directory is invisible,
    # 0 if it's visible or undef if an error occurred

    sub invisible($) {
      my $file = shift;
      my ($fileCat, $fileInfo);
      my $invisible_flag = 1 &lt;&lt; 14;

      if ( $fileCat = FSpGetCatInfo($file) ) {
        if ($fileInfo = $fileCat-&gt;ioFlFndrInfo() ) {
          return (($fileInfo-&gt;fdFlags &amp; $invisible_flag) &amp;&amp; 1);
        }
      }
      return undef;
    }</pre>
419 <p>Generally, invisible files are system files, unless an odd application
420 decides to use invisible files for its own purposes. To distinguish
421 such files from system files, you have to look at the <strong>type</strong> and <strong>creator</strong>
422 file attributes. The MacPerl built-in functions <code>GetFileInfo(FILE)</code> and
423 <code>SetFileInfo(CREATOR, TYPE, FILES)</code> offer access to these attributes
424 (see MacPerl.pm for details).</p>
<p>Files that appear on the desktop actually reside in a (hidden) directory
426 named ``Desktop Folder'' on the particular disk volume. Note that, although
427 all desktop files appear to be on the same ``virtual'' desktop, each disk
428 volume actually maintains its own ``Desktop Folder'' directory.</p>
429 </li>
430 </ul>
431 </ul>
<p>
</p>
434 <hr />
435 <h1><a name="bugs_and_caveats">BUGS AND CAVEATS</a></h1>
436 <p>Despite the name of the <a href="#item_finddepth"><code>finddepth()</code></a> function, both <a href="#item_find"><code>find()</code></a> and
437 <a href="#item_finddepth"><code>finddepth()</code></a> perform a depth-first search of the directory
438 hierarchy.</p>
<p>
</p>
441 <hr />
442 <h1><a name="history">HISTORY</a></h1>
443 <p>File::Find used to produce incorrect results if called recursively.
444 During the development of perl 5.8 this bug was fixed.
445 The first fixed version of File::Find was 1.01.</p>
452 </body>
454 </html>