1 \input texinfo @c -*-texinfo-*-
3 @setfilename find-maint.info
4 @settitle Maintaining Findutils
5 @c For double-sided printing, uncomment:
6 @c @setchapternewpage odd
9 @include versionmaint.texi
15 @dircategory GNU organization
17 * Maintaining Findutils: (find-maint). Maintaining GNU findutils
21 This manual explains how GNU findutils is maintained, how changes should
22 be made and tested, and what resources exist to help developers.
24 This is edition @value{EDITION}, for findutils version @value{VERSION}.
26 Copyright @copyright{} 2007 Free Software Foundation, Inc.
28 Permission is granted to copy, distribute and/or modify this document
29 under the terms of the GNU Free Documentation License, Version 1.2
30 or any later version published by the Free Software Foundation;
31 with no Invariant Sections, with no
32 Front-Cover Texts, and with no Back-Cover Texts.
33 A copy of the license is included in the section entitled ``GNU
34 Free Documentation License''.
38 @title Maintaining Findutils
39 @subtitle Edition @value{EDITION}, for GNU findutils version @value{VERSION}
40 @subtitle @value{UPDATED}
41 @author by James Youngman
44 @vskip 0pt plus 1filll
51 @node Top, Introduction, (dir), (dir)
52 @top Maintaining GNU Findutils
59 * Maintaining GNU Programs::
61 * Coding Conventions::
63 * Using the GNU Portability Library::
68 * Internationalisation::
71 * GNU Free Documentation License::
81 This document explains how to contribute to and maintain GNU
82 Findutils. It concentrates on developer-specific issues. For
information about how to use the software please refer to
@ref{Introduction, ,Introduction,find,The Findutils manual}.
This manual aims to be useful without necessarily being verbose. It's
also a recent document, so there will be many areas in which
improvements can be made. If you find that the document misses out
important information or that any part of it is so terse as to
be unhelpful, please ask for help on the @email{bug-findutils@@gnu.org}
mailing list. We'll try to improve this document too.
94 @node Maintaining GNU Programs
95 @chapter Maintaining GNU Programs
GNU Findutils is part of the GNU Project and so there are a number of
documents which set out standards for the maintenance of GNU software.
GNU Project Coding Standards. All changes to findutils should comply
with these standards. In some areas we go somewhat beyond the
requirements of the standards, but these cases are explained in this manual.
Information for Maintainers of GNU Software. This document provides
guidance for GNU maintainers. Everybody with commit access should
read this document. Everybody else is welcome to do so too, of course.
117 @chapter Design Issues
The findutils package is installed on a great many systems, usually as a
120 fundamental component. The programs in the package are often used in
121 order to successfully boot or fix the system.
123 This fact means that for findutils we bear in mind considerations that
may not apply so much to other packages. For example, the fact
125 that findutils is often a base component motivates us to
127 @item Limit dependencies on libraries
128 @item Avoid dependencies on other large packages (for example, interpreters)
129 @item Be conservative when making changes to the 'stable' release branch
132 All those considerations come before functionality. Functional
133 enhancements are still made to findutils, but these are almost
134 exclusively introduced in the 'development' release branch, to allow
135 extensive testing and proving.
137 Sometimes it is useful to have a priority list to provide guidance
138 when making design trade-offs. For findutils, that priority list is:
142 @item Standards compliance
144 @item Backward compatibility
149 For example, we support the @code{-exec} action because POSIX
150 compliance requires this, even though there are security problems with
151 it and we would otherwise prefer people to use @code{-execdir}. There
152 are also cases where some performance is sacrificed in the name of
153 security. For example, the sanity checks that @code{find} performs
154 while traversing a directory tree may slow it down. We adopt
155 functional changes, and functional changes are allowed to make
156 @code{find} slower, but only if there is no detectable impact on users
157 who don't use the feature.
159 Backward-incompatible changes do get made in order to comply with
160 standards (for example the behaviour of @code{-perm -...} changed in
161 order to comply with POSIX). However, they don't get made in order to
162 provide better ease of use; for example the semantics of @code{-size
163 -2G} are almost always unexpected by users, but we retain the current
164 behaviour because of backward compatibility and for its similarity to
165 the block-rounding behaviour of @code{-size -30}. We might introduce
166 a change which does not have the unfortunate rounding behaviour, but
we would choose another syntax (for example @code{-size '<2G'}) for that.
170 In a general sense, we try to do test-driven development of the
171 findutils code; that is, we try to implement test cases for new
172 features and bug fixes before modifying the code to make the test
173 pass. Some features of the code are tested well, but the test
174 coverage for other features is less good. If you are about to modify
175 the code for a predicate and aren't sure about the test coverage, use
176 @code{grep} on the test directories and measure the coverage with
177 @code{gcov} or another test coverage tool.
179 Lastly, we try not to depend on having a ``working system''. The
180 findutils suite is used for diagnosis of problems, and this applies
181 especially to @code{find}. We should ensure that @code{find} still
182 works on relatively broken systems, for example systems with damaged
183 @file{/etc/passwd} files. Another interesting example is the case
184 where a system is a client of one or more unresponsive NFS servers.
185 On such a system, if you try to stat all mount points, your program
186 will hang indefinitely, waiting for the remote NFS server to respond.
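The damaged @file{/etc/passwd} case can be illustrated with a small sketch. It is not taken from the findutils sources: the function name @code{format_owner} is hypothetical, and the only assumption is the POSIX @code{getpwuid} call, which returns a null pointer when no entry can be found. The idea is simply to degrade to the numeric ID rather than fail.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: write a printable owner name for UID into BUF.
   If the password database has no usable entry (for example because
   /etc/passwd is damaged), fall back to the numeric ID.  */
void
format_owner (uid_t uid, char *buf, size_t buflen)
{
  const struct passwd *pw;

  assert (buf != NULL && buflen > 0);   /* a failure here is a caller bug */

  pw = getpwuid (uid);
  if (pw != NULL && pw->pw_name != NULL && pw->pw_name[0] != 0)
    snprintf (buf, buflen, "%s", pw->pw_name);
  else
    snprintf (buf, buflen, "%lu", (unsigned long) uid);
}
```

A caller would pass the @code{st_uid} field of a @code{struct stat} and always get something printable back, whatever state the password database is in.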
190 @c Installed on many systems
191 @c Often part of base
192 @c Needs to work on broken systems (e.g. unresponsive NFS servers,
195 @node Coding Conventions
196 @chapter Coding Conventions
198 Coding style documents which set out to establish a uniform look and
199 feel to source code have worthy goals, for example greater ease of
200 maintenance and readability. However, I do not believe that in
201 general coding style guide authors can envisage every situation, and
202 it is always possible that it might on occasion be necessary to break
203 the letter of the style guide in order to honour its spirit, or to
204 better achieve the style guide's goals.
206 I've certainly seen many style guides outside the free software world
207 which make bald statements such as ``functions shall have exactly one
208 return statement''. The desire to ensure consistency and obviousness
209 of control flow is laudable, but it is all too common for such bald
210 requirements to be followed unthinkingly. Certainly I've seen such
211 coding standards result in unmaintainable code with terrible
212 infelicities such as functions containing @code{if} statements nested
213 nine levels deep. I suppose such coding standards don't survive in
214 free software projects because they tend to drive away potential
215 contributors or tend to generate heated discussions on mailing lists.
216 Equally, a nine-level-deep function in a free software program would
quickly get refactored, assuming it is obvious what the function is supposed to do.
220 Be that as it may, the approach I will take for this document is to
221 explain some idioms and practices in use in the findutils source code,
222 and leave it up to the reader's engineering judgement to decide which
223 considerations apply to the code they are working on, and whether or
not there is sufficient reason to ignore the guidance in current circumstances.
229 * Make the Compiler Find the Bugs::
230 * The File System Is Being Modified::
231 * Don't Trust the File System Contents::
232 * Debugging is For Users Too::
233 * Factor Out Repeated Code::
236 @node Make the Compiler Find the Bugs
237 @section Make the Compiler Find the Bugs
239 Finding bugs is tedious. If I have a filesystem containing two
240 million files, and a find command line should print one million of
241 them, but in fact it misses out 1%, you can tell the program is
242 printing the wrong result only if you know the right answer for that
243 filesystem at that time. If you don't know this, you may just not
244 find out about that bug. For this reason it is important to have a
245 comprehensive test suite.
247 The test suite is of course not the only way to find the bugs. The
findutils source code makes liberal use of the @code{assert} macro. While on
the one hand assertions might be a performance drain, the performance
impact of most of them is negligible compared to the time taken to
fetch even one sector from a disk drive.
253 Assertions should not be used to check the results of operations which
254 may be affected by the program's external environment. For example,
255 never assert that a file could be opened successfully. Errors
256 relating to problems with the program's execution environment should
257 be diagnosed with a user-oriented error message. An assertion failure
258 should always denote a bug in the program.
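The distinction can be sketched as follows. This is an illustration only, not code from findutils; the function name @code{open_input} and the wording of the message are assumptions. A null path can only result from a programming mistake, so it is asserted; a failed @code{fopen} depends on the environment, so it is reported to the user.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: open PATH for reading, or return NULL after
   printing a user-oriented message.  */
FILE *
open_input (const char *path)
{
  FILE *fp;

  assert (path != NULL);        /* a failure here is a bug in the program */

  fp = fopen (path, "r");
  if (fp == NULL)
    {
      /* Not a bug: the file may be missing or unreadable.  */
      fprintf (stderr, "cannot open %s: %s\n", path, strerror (errno));
      return NULL;
    }
  return fp;
}
```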
260 Several programs in the findutils suite perform self-checks. See for
261 example the function @code{pred_sanity_check} in @file{find/pred.c}.
262 This is generally desirable.
264 There are also a number of small ways in which we can help the
265 compiler to find the bugs for us.
267 @subsection Constants in Equality Testing
269 It's a common error to write @code{=} when @code{==} is meant.
270 Sometimes this happens in new code and is simply due to finger
271 trouble. Sometimes it is the result of the inadvertent deletion of a
272 character. In any case, there is a subset of cases where we can
273 persuade the compiler to generate an error message when we make this
274 mistake; this is where the equality test is with a constant.
276 This is an example of a vulnerable piece of code.
283 A simple typo converts the above into
290 We've introduced a bug; the condition is always true, and the value of
291 @code{x} has been changed. However, a simple change to our practice
292 would have made us immune to this problem:
299 Usually, the Emacs keystroke @kbd{M-t} can be used to swap the operands.
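As a minimal sketch of the two orderings (the function names are hypothetical and chosen only for this illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Vulnerable ordering: mistyping "==" as "=" here would still compile,
   assigning 2 to x and making the result always true.  */
bool
is_two_variable_first (int x)
{
  return x == 2;
}

/* Preferred ordering: mistyping "==" as "=" here gives "2 = x", which
   the compiler rejects because a constant cannot be assigned to.  */
bool
is_two_constant_first (int x)
{
  return 2 == x;
}

/* When typed correctly, both spellings mean the same thing.  */
void
check_both_agree (int x)
{
  assert (is_two_variable_first (x) == is_two_constant_first (x));
}
```

Many compilers can also warn about an assignment used as a condition, but that depends on the warning options in use, whereas the constant-first form fails to compile regardless.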
302 @subsection Spelling of ASCII NUL
304 Strings in C are just sequences of characters terminated by a NUL.
305 The ASCII NUL character has the numerical value zero. It is normally
306 represented in C code as @samp{\0}. Here is a typical piece of C
313 Consider what happens if there is an unfortunate typo:
319 We have changed the meaning of our program and the compiler cannot
320 diagnose this as an error. Our string is no longer terminated. Bad
321 things will probably happen. It would be better if the compiler could
322 help us diagnose this problem.
324 In C, the type of @code{'\0'} is in fact int, not char. This provides
325 us with a simple way to avoid this error. The constant @code{0} has
326 the same value and type as the constant @code{'\0'}. However, it is
not as vulnerable to typos. For this reason I normally prefer to use @code{0}.
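A short sketch of the practice, assuming a hypothetical helper named @code{copy_terminated} that is not part of findutils:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most DESTLEN - 1 bytes of SRC into DEST and terminate it.
   The terminator is spelled 0 rather than '\0': dropping the backslash
   from '\0' yields '0' (the digit, value 48 in ASCII), which still
   compiles but leaves the string unterminated.  */
void
copy_terminated (char *dest, size_t destlen, const char *src)
{
  size_t n;

  assert (dest != NULL && src != NULL && destlen > 0);

  n = strlen (src);
  if (n > destlen - 1)
    n = destlen - 1;
  memcpy (dest, src, n);
  dest[n] = 0;
}
```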
335 @node Factor Out Repeated Code
336 @section Factor Out Repeated Code
338 Repeated code imposes a greater maintenance burden and increases the
339 exposure to bugs. For example, if you discover that something you
340 want to implement has some similarity with an existing piece of code,
341 don't cut and paste it. Instead, factor the code out. The risk of
342 cutting and pasting the code, particularly if you do this several
343 times, is that you end up with several copies of the same code.
345 If the original code had a bug, you now have N places where this needs
to be fixed. It's all too easy to miss some out when trying to fix the
347 bug. Equally, it's quite possible that when pasting the code into
348 some function, the pasted code was not quite adapted correctly to its
new environment. To pick a contrived example, perhaps it modifies a
global variable which that code shouldn't be touching in its new
351 home. Worse, perhaps it makes some unstated assumption about the
352 nature of the input arguments which is in fact not true for the
353 context of the now duplicated code.
355 A good example of the use of refactoring in findutils is the
356 @code{collect_arg} function in @file{find/parser.c}. A less clear-cut
357 but larger example is the factoring out of code which would otherwise
358 have been duplicated between @file{find/find.c} and
@file{find/ftsfind.c}.
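The general shape of such a factoring can be sketched as below. This is not the real @code{collect_arg}; the names @code{take_arg}, @code{parse_name} and @code{parse_user} and their signatures are hypothetical, invented for this illustration. The point is that the end-of-arguments check lives in one place instead of being pasted into every parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: fetch the argument following a predicate.
   Returns false if the command line ends before the argument;
   otherwise stores it in *ARG and advances *INDEX past it.  */
bool
take_arg (char **argv, int *index, const char **arg)
{
  assert (argv != NULL && index != NULL && arg != NULL);

  if (argv[*index] == NULL)
    {
      *arg = NULL;
      return false;
    }
  *arg = argv[*index];
  (*index)++;
  return true;
}

/* Two hypothetical predicate parsers share the helper rather than
   each repeating the end-of-arguments check.  */
bool
parse_name (char **argv, int *index, const char **pattern)
{
  return take_arg (argv, index, pattern);
}

bool
parse_user (char **argv, int *index, const char **user)
{
  return take_arg (argv, index, user);
}
```

If the check ever needs to change, for example to issue a better diagnostic, there is exactly one function to fix.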
361 The findutils test suite is comprehensive enough that refactoring code
362 should not generally be a daunting prospect from a testing point of
view. Nevertheless there are some areas which are only lightly tested:
367 @item Tests on the ages of files
368 @item Code which deals with the values returned by operating system calls (for example handling of ENOENT)
369 @item Code dealing with OS limits (for example, limits on path length
371 @item Code relating to features not all systems have (for example
375 Please exercise caution when working in those areas.
378 @node Debugging is For Users Too
379 @section Debugging is For Users Too
381 Debug and diagnostic code is often used to verify that a program is
382 working in the way its author thinks it should be. But users are
often uncertain about what a program is doing, too. Exposing a
little more diagnostic information to them can help. Much of the diagnostic
385 code in @code{find}, for example, is controlled by the @samp{-D} flag,
386 as opposed to C preprocessor directives.
Making diagnostic messages available to users means that the
phrasing of the diagnostic messages becomes important, too.
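The difference between a run-time switch and a preprocessor directive can be sketched like this. The names @code{debug_flags}, @code{DEBUG_STAT}, @code{DEBUG_TREE} and @code{debug_message} are hypothetical and do not describe how @code{find} implements @samp{-D}; the sketch only shows diagnostics that are compiled in unconditionally and selected by a variable the user can set.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical debug categories, selectable at run time.  */
enum
{
  DEBUG_STAT = 1 << 0,
  DEBUG_TREE = 1 << 1
};

/* Set from the command line rather than fixed at compile time.  */
unsigned int debug_flags = 0;

/* Print a diagnostic on stderr if category WHAT is enabled.
   Returns nonzero if the message was printed.  */
int
debug_message (unsigned int what, const char *format, ...)
{
  va_list ap;

  assert (format != NULL);

  if ((debug_flags & what) == 0)
    return 0;

  va_start (ap, format);
  vfprintf (stderr, format, ap);
  va_end (ap);
  return 1;
}
```

Because the messages are always present in the binary, a user can turn them on without rebuilding, which is also why their wording deserves care.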
392 @node Don't Trust the File System Contents
393 @section Don't Trust the File System Contents
395 People use @code{find} to search in directories created by other
people. Sometimes they do this to check for suspicious activity (for
397 example to look for new setuid binaries). This means that it would be
398 bad if @code{find} were vulnerable to, say, a security problem
399 exploitable by constructing a specially-crafted filename. The same
400 consideration would apply to @code{locate} and @code{updatedb}.
402 Henry Spencer said this well in his fifth commandment:
404 Thou shalt check the array bounds of all strings (indeed, all arrays),
405 for surely where thou typest @samp{foo} someone someday shall type
406 @samp{supercalifragilisticexpialidocious}.
409 Symbolic links can often be a problem. If @code{find} calls
410 @code{lstat} on something and discovers that it is a directory, it's
411 normal for @code{find} to recurse into it. Even if the @code{chdir}
412 system call is used immediately, there is still a window of
413 opportunity between the @code{lstat} and the @code{chdir} in which a
414 malicious person could rename the directory and substitute a symbolic
415 link to some other directory.
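One way to detect that the directory was swapped is to compare the device and inode numbers seen before and after the @code{chdir}. The following is a sketch under that assumption, with a hypothetical function name; it does not claim to reproduce what @code{find} itself does, and it omits the recovery a real program would need after a mismatch.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: change into directory NAME only if it is still
   the directory we examined.  Returns 0 on success and -1 otherwise.
   On a mismatch the working directory has already changed; a real
   program would need to go back.  */
int
checked_chdir (const char *name)
{
  struct stat before, after;

  assert (name != NULL);

  if (lstat (name, &before) != 0 || !S_ISDIR (before.st_mode))
    return -1;
  if (chdir (name) != 0)
    return -1;
  if (stat (".", &after) != 0)
    return -1;

  /* If the device or inode changed, NAME was replaced between the
     lstat and the chdir, perhaps by a symbolic link.  */
  if (before.st_dev != after.st_dev || before.st_ino != after.st_ino)
    return -1;
  return 0;
}
```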
417 @node The File System Is Being Modified
418 @section The File System Is Being Modified
The filesystem gets modified while you are traversing it. For
example, it's normal for files to get deleted while @code{find} is
422 traversing a directory. Issuing an error message seems helpful when a
423 file is deleted from the one directory you are interested in, but if
@code{find} is searching 15000 directories, such a message becomes an irritation.
Bear in mind also that the directory @code{find} is
currently searching could be moved to another point in the filesystem,
429 and that the directory in which @code{find} was started could be
432 Henry Spencer's sixth commandment is also apposite here:
434 If a function be advertised to return an error code in the event of
435 difficulties, thou shalt check for that code, yea, even though the
436 checks triple the size of thy code and produce aches in thy typing
437 fingers, for if thou thinkest ``it cannot happen to me'', the gods
438 shall surely punish thee for thy arrogance.
441 There are a lot of files out there. They come in all dates and
442 sizes. There is a condition out there in the real world to exercise
443 every bit of the code base. So we try to test that code base before
444 someone falls over a bug.
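Checking every return code while tolerating files that vanish during traversal might look like the sketch below. The names @code{examine_entry} and @code{visit_result} are hypothetical, and treating @code{ENOENT} as a quiet skip is an assumption made for the illustration, not a statement of @code{find}'s exact policy.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Possible outcomes of examining one directory entry.  */
enum visit_result { VISIT_OK, VISIT_VANISHED, VISIT_ERROR };

/* Hypothetical helper: examine PATH, distinguishing a file that was
   deleted since its name was read from a genuine error.  */
enum visit_result
examine_entry (const char *path, struct stat *st)
{
  assert (path != NULL && st != NULL);

  if (lstat (path, st) == 0)
    return VISIT_OK;

  if (errno == ENOENT)
    return VISIT_VANISHED;      /* deleted in the meantime: skip quietly */

  fprintf (stderr, "%s: %s\n", path, strerror (errno));
  return VISIT_ERROR;
}
```

The caller decides what to do with each outcome, so the policy on vanished files stays in one place.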
449 Most of the tools required to build findutils are mentioned in the
450 file @file{README-CVS}. We also use some other tools:
453 @item System call traces
454 Much of the execution time of find is spent waiting for filesystem
455 operations. A system call trace (for example, that provided by
456 @code{strace}) shows what system calls are being made. Using this
457 information we can work to remove unnecessary file system operations.
460 Valgrind is a tool which dynamically verifies the memory accesses a
461 program makes to ensure that they are valid (for example, that the
462 behaviour of the program does not in any way depend on the contents of
463 uninitialised memory).
466 DejaGnu is the test framework used to run the findutils test suite
467 (the @code{runtest} program is part of DejaGnu). It would be ideal if
468 everybody building @code{findutils} also ran the test suite, but many
469 people don't have DejaGnu installed. When changes are made to
findutils, DejaGnu is invoked a lot. @xref{Testing}, for more information.
474 @node Using the GNU Portability Library
475 @chapter Using the GNU Portability Library
476 The Gnulib library (@url{http://www.gnu.org/software/gnulib/}) makes a
477 variety of systems look more like a GNU/Linux system and also applies
a bunch of automatic bug fixes and workarounds. Some of these
apply to GNU/Linux systems too. For example, the Gnulib regex
480 implementation is used when we determine that we are building on a
481 GNU libc system with a bug in the regex implementation.
484 @section How and Why we Import the Gnulib Code
Gnulib does not have a release process which results in a source
tarball you can download. Instead, the code is simply made available
via Git. The code is also available via @code{git-cvspserver}, but
we decided not to use this, since @code{import-gnulib.sh} depended on
@code{cvs update -D}, which at the time @code{git-cvspserver} did not
support.
492 GNU projects vary in how they interact with Gnulib. Many import a
493 selection of code from Gnulib into the working directory and then
494 check the updated files into the CVS repository for their project.
495 The coreutils project does this, for example.
497 At the last maintainer changeover for findutils (2003) it turned out
498 that there was a lot of material in findutils in common with Gnulib,
499 but it had not been updated in a long time. It was difficult to
500 figure out which source files were intended to track external sources
501 and which were intended to contain incompatible changes, or diverge
504 To reduce this uncertainty, I decided to treat Gnulib much like
505 Automake. Files supplied by Automake are simply absent from the
506 findutils source tree. When Automake is run with @code{automake
507 --add-missing --copy}, it adds in all the files it thinks should be
508 there which aren't there already.
An analogous approach is taken with Gnulib. The Gnulib code is
imported from the Git repository for Gnulib with a findutils helper
512 script, @code{import-gnulib.sh}. That script fetches a copy of the
513 Gnulib code into the subdirectory @file{gnulib-git} and then runs
514 @code{gnulib-tool}. The @code{gnulib-tool} program copies the
515 required parts of Gnulib into the findutils source tree in the
516 subdirectory @file{gnulib}. This process gives us the property that
the code in @file{gnulib} and @file{gnulib-git} is not included in the
518 findutils CVS tree. Both directories are listed in @file{.cvsignore}
519 and so CVS ignores them.
521 Findutils does not use all the Gnulib code. The modules we need are
522 listed in the file @file{import-gnulib.config}. The same file also
523 indicates the version of Gnulib that we want to use. Since Gnulib has
524 no actual release process, we just use a date. Both
525 @file{import-gnulib.sh} and @file{import-gnulib.config} are in the
526 findutils CVS repository.
528 The upshot of all this is that we can use the findutils CVS repository
529 to track which version of Gnulib every findutils release uses. That
530 information is also provided when the user invokes a findutils program
with the @samp{--version} option. It also means that if a file exists
in the findutils CVS repository and is different from a similar file
elsewhere, you can be certain that it's for a reason.
536 There are a small number of exceptions to this; the standard
537 boiler-plate GNU files such as @file{ABOUT-NLS}, @file{INSTALL} and
541 @section How We Fix Gnulib Bugs
If we always import the Gnulib code directly from the Gnulib
repository in this way, it is impossible to maintain a locally
544 different copy of Gnulib. This is often a benefit in that accidental
545 version skew is prevented.
547 However, sometimes we want deliberate version skew in order to use a
548 findutils-specific patched version of a Gnulib file, for example
549 because we fixed a bug.
551 Gnulib is used by quite a number of GNU projects, and this means that
552 it gets plenty of testing. Therefore there are relatively few bugs in
553 the Gnulib code, but it does happen from time to time.
555 However, since there is no waiting around for a Gnulib source release
556 tarball, Gnulib bugs are generally fixed quickly. Here is an outline
of the way we would contribute a fix to Gnulib (assuming you know it
is not already fixed in the current Gnulib sources):
@item Check that you have already completed a copyright assignment for Gnulib
562 @item Begin with a vanilla CVS tree
563 Download the Findutils source code from CVS (or use the tree you have
565 @item Check out a copy of the Gnulib source
566 Check out a copy of the Gnulib source tree. The
567 @code{import-gnulib.sh} script may have generated a shallow git clone
568 as opposed to a normal, full clone in the directory @file{gnulib-git}.
This means that you may not be able to clone the repository that
@code{import-gnulib.sh} generates. However, you can make a normal
(full) clone with @code{git clone
git://git.savannah.gnu.org/gnulib.git}. Do this somewhere
outside the findutils source tree.
574 @item Import Gnulib from your local copy
575 The @code{import-gnulib.sh} tool has a @samp{-d} option which you can
576 use to import the code from a local copy of Gnulib.
577 @item Build findutils
578 Build findutils and run the test suite, which should pass. In our
579 example we assume you have just noticed a bug in Gnulib, not that
580 recent Gnulib changes broke the findutils regression tests.
581 @item Write a test case
582 If in fact Gnulib did break the findutils regression tests, you can probably
583 skip this step, since you already have a test case demonstrating the problem.
584 Otherwise, write a findutils test case for the bug and/or a Gnulib test case.
585 @item Fix the Gnulib bug
Make sure your editor follows symbolic links so that your changes to
@file{gnulib/...} actually affect the files in the Gnulib working
tree you cloned earlier. Observe that your test now passes.
589 @item Prepare a Gnulib patch
Use @code{git diff} in your Gnulib clone to prepare the patch. Write a ChangeLog
entry and prepend this to the patch. Check that the patch conforms
with the GNU coding standards, and email it to the Gnulib mailing list.
594 @item Wait for the patch to be applied
595 Once your bug fix has been applied, you can update your local directory
596 from git, re-import the code into Findutils (still using the @code{-d}
597 option), and re-run the tests. This verifies that the fix the Gnulib
598 team made actually fixes your problem.
599 @item Reimport the Gnulib code
Update the findutils file @file{import-gnulib.config} to specify a git
commit made at or after the point at which the bug fix was committed to
Gnulib. You can find the identifier of the current commit with
@code{git rev-parse HEAD}. Finally,
603 re-import the Gnulib code directly from git by using
604 @samp{import-gnulib.sh} without the @samp{-d} option, and run the
605 tests again. This verifies that there was no remaining local change
606 that we were relying on to fix the bug. Make sure you checked
607 everything in by running @code{git status}.
611 @chapter Documentation
The findutils CVS tree includes several different types of documentation.
616 @section User Documentation
User-oriented documentation is provided as manual pages and in
Texinfo; see
@ref{Introduction,,Introduction,find,The Findutils manual}.
Please make sure both sets of documentation are updated if you make a
change to the code. The GNU coding standards do not normally call for
maintaining manual pages, on the grounds of effort duplication.
However, the manual page format is more convenient for quick
reference, and so it's worth maintaining both types of documentation.
The manual pages are normally rather more terse than the
Texinfo documentation: they are suitable for reference
use, but the Texinfo manual should also include introductory and
tutorial material.
632 @section Build Guidance
636 Describes the Free Translation Project, the translation status of
637 various GNU projects, and how to participate by translating an
640 Lists the authors of findutils.
642 The copyright license covering findutils; currently, the GNU GPL,
645 Generic installation instructions for installing GNU programs.
647 Information about how to compile findutils in particular
649 A README file which is included with testing releases of findutils.
651 Describes how to build findutils from the code in CVS.
Thanks to people who contributed to findutils. Generally, if
654 someone's contribution was significant enough to need a copyright
655 assignment, their name should go in here.
661 @section Release Information
Enumerates the user-visible changes in each release. Typical changes
665 are fixed bugs, functionality changes and documentation changes.
666 Include the date when a release is made.
668 This file enumerates all changes to the findutils source code (with
669 the possible exception of @file{.cvsignore} and @code{.gitignore}
670 changes). The level of detail used for this file should be sufficient
671 to answer the questions ``what changed?'' and ``why was it changed?''.
672 If a change fixes a bug, always give the bug reference number in both
673 the @file{ChangeLog} and @file{NEWS} files and of course also in the
674 checkin message. In general, it should be possible to enumerate all
675 material changes to a function by searching for its name in
676 @file{ChangeLog}. Mention when each release is made.
681 This chapter will explain the general procedures for adding tests to
682 the test suite, and the functions defined in the findutils-specific
683 DejaGnu configuration. Where appropriate references will be made to
684 the DejaGnu documentation.
689 Bugs are logged in the Savannah bug tracker
690 @url{http://savannah.gnu.org/bugs/?group=findutils}. The tracker
691 offers several fields but their use is largely obvious. The
692 life-cycle of a bug is like this:
697 Someone, usually a maintainer, a distribution maintainer or a user,
698 creates a bug by filling in the form. They fill in field values as
699 they see fit. This will generate an email to
700 @email{bug-findutils@@gnu.org}.
703 The bug hangs around with @samp{Status=None} until someone begins to
704 work on it. At that point they set the ``Assigned To'' field and will
705 sometimes set the status to @samp{In Progress}, especially if the bug
706 will take a while to fix.
709 Quite a lot of reports are not actually bugs; for these the usual
710 procedure is to explain why the problem is not a bug, set the status
711 to @samp{Invalid} and close the bug. Make sure you set the
712 @samp{Assigned to} field to yourself before closing the bug.
715 When you commit a bug fix into CVS (or in the case of a contributed
716 patch, commit the change), mark the bug as @samp{Fixed}. Make sure
717 you include a new test case where this is relevant. If you can figure
718 out which releases are affected, please also set the @samp{Release}
719 field to the earliest release which is affected by the bug.
720 Indicate which source branch the fix is included in (for example,
721 4.2.x or 4.3.x). Don't close the bug yet.
724 When a release is made which includes the bug fix, make sure the bug
725 is listed in the NEWS file. Once the release is made, fill in the
726 @samp{Fixed Release} field and close the bug.
731 @chapter Distributions
732 Almost all GNU/Linux distributions include findutils, but only some of
733 them have a package maintainer who is a member of the mailing list.
734 Distributions don't often feed back patches to the
735 @email{bug-findutils@@gnu.org} list, but on the other hand many of
736 their patches relate only to standards for file locations and so
737 forth, and are therefore distribution specific. On an irregular basis
738 I check the current patches being used by one or two distributions,
739 but the total number of GNU/Linux distributions is large enough that
740 we could not hope to cover them all.
742 Often, bugs are raised against a distribution's bug tracker instead of
743 GNU's. Periodically (about every six months) I take a look at some
744 of the more accessible bug trackers to indicate which bugs have been
747 Many distributions include both findutils and the slocate package,
748 which provides a replacement @code{locate}.
751 @node Internationalisation
752 @chapter Internationalisation
753 Translation is essentially automated from the maintainer's point of
754 view. The TP mails the maintainer when a new PO file is available,
755 and we just download it and check it in. We copy the @file{.po} files
756 into the CVS repository. For more information, please see
757 @url{http://www.iro.umontreal.ca/translation/HTML/domain-findutils.html}.
763 See @ref{Security Considerations, ,Security Considerations,find,The
764 Findutils manual}, for a full description of the findutils approach to
765 security considerations and discussion of particular tools.
767 If someone reports a security bug publicly, we should fix this as
768 rapidly as possible. If necessary, this can mean issuing a fixed
release containing just the one bug fix. We try to avoid issuing
releases which include both significant security fixes and functional changes.
773 Where someone reports a security problem privately, we generally try
774 to construct and test a patch without checking the intermediate code
775 in. Once everything has been tested, this allows us to commit a patch
776 and immediately make a release. The advantage of doing things this
way is that we avoid situations where people watching for CVS commits
can figure out and exploit a security problem before a fixed release is available.
781 It's important that security problems be fixed promptly, but don't
782 rush so much that things go wrong. Make sure the new release really
783 fixes the problem. It's usually best not to include functional
784 changes in your security-fix release.
786 If the security problem is serious, send an alert to
787 @email{vendor-sec@@lst.de}. The members of the list include most
788 GNU/Linux distributions. The point of doing this is to allow them to
789 prepare to release your security fix to their customers, once the fix
790 becomes available. Here is an example alert:-
793 GNU findutils heap buffer overrun (potential privilege escalation)
795 $Revision: 1.5 $; $Date: 2007/11/29 11:07:19 $
801 GNU findutils is a set of programs which search for files on Unix-like
802 systems. It is maintained by the GNU Project of the Free Software
803 Foundation. For more information, see
804 @url{http://www.gnu.org/software/findutils}.
810 When GNU locate reads filenames from an old-format locate database,
811 they are read into a fixed-length buffer allocated on the heap.
812 Filenames longer than the 1026-byte buffer can cause a buffer overrun.
The overrunning data can be chosen by any person able to control the
names of files created on the local system. This will normally
815 include all local users, but in many cases also remote users (for
816 example in the case of FTP servers allowing uploads).
Findutils supports three different formats of locate database: its
native format "LOCATE02", the slocate variant of LOCATE02, and a
traditional ("old") format that locate uses on other Unix systems.
825 When locate reads filenames from a LOCATE02 database (the default
826 format), the buffer into which data is read is automatically extended
to accommodate the length of the filenames.
829 This automatic buffer extension does not happen for old-format
830 databases. Instead a 1026-byte buffer is used. When a longer
831 pathname appears in the locate database, the end of this buffer is
832 overrun. The buffer is allocated on the heap (not the stack).
834 If the locate database is in the default LOCATE02 format, the locate
835 program does perform automatic buffer extension, and the program is
836 not vulnerable to this problem. The software used to build the
837 old-format locate database is not itself vulnerable to the same
840 Most installations of GNU findutils do not use the old database
841 format, and so will not be vulnerable.
849 All existing releases of findutils are affected.
To discover the longest path name on a given system, you can use the
856 following command (requires GNU findutils and GNU coreutils):
859 find / -print0 | tr -c '\0' 'x' | tr '\0' '\n' | wc -L
865 This section includes a shell script which determines which of a list
866 of locate binaries is vulnerable to the problem. The shell script has
867 been tested only on glibc based systems having a mktemp binary.
869 NOTE: This script deliberately overruns the buffer in order to
870 determine if a binary is affected. Therefore running it on your
871 system may have undesirable effects. We recommend that you read the
872 script before running it.
877 if vanilla_db="$(mktemp nicedb.XXXXXX)" ; then
878 if updatedb --prunepaths="" --old-format --localpaths="/tmp" \
879 --output="$@{vanilla_db@}" ; then
882 rm -f "$@{vanilla_db@}"
884 echo "Failed to create old-format locate database; skipping the sanity checks" >&2
889 # Start with a valid database
890 cat "$@{vanilla_db@}"
891 # Make the final entry really long
892 dd if=/dev/zero bs=1 count=1500 2>/dev/null | tr '\000' 'x'
899 usage() @{ echo "usage: $0 binary [binary...]" >&2; exit $1; @}
900 [ $# -eq 0 ] && usage 1
905 if dbfile="$(mktemp nasty.XXXXXX)"
907 make_overrun_db > "$dbfile"
909 ver="$locate = $("$locate" --version | head -1)"
910 if [ -z "$vanilla_db" ] || "$locate" -d "$vanilla_db" "" >/dev/null ; then
911 "$locate" -d "$dbfile" "" >/dev/null
912 if [ $? -gt 128 ] ; then
920 # the regular locate failed
922 buggy, may or may not be vulnerable: $ver"
925 rm -f "$@{dbfile@}" "$@{vanilla_db@}"
926 # good: unaffected. bad: affected (vulnerable).
927 # ugly: doesn't even work for a normal old-format database.
942 The GNU project discovered the problem while 'locate' was being worked
943 on; this is the first public announcement of the problem.
The GNU findutils maintainer has issued a patch as part of this
946 announcement. The patch appears below.
948 A source release of findutils-4.2.31 will be issued on 2007-05-30.
949 That release will of course include the patch. The patch will be
950 committed to the public CVS repository at the same time. Public
951 announcements of the release, including a description of the bug, will
952 be made at the same time as the release.
954 A release of findutils-4.3.x will follow and will also include the
961 This patch should apply to findutils-4.2.23 and later.
962 Findutils-4.2.23 was released almost two years ago.
964 Index: locate/locate.c
965 ===================================================================
966 RCS file: /cvsroot/findutils/findutils/locate/locate.c,v
967 retrieving revision 1.58.2.2
968 diff -u -p -r1.58.2.2 locate.c
969 --- locate/locate.c 22 Apr 2007 16:57:42 -0000 1.58.2.2
970 +++ locate/locate.c 28 May 2007 10:18:16 -0000
971 @@@@ -124,9 +124,9 @@@@ extern int errno;
973 #include "locatedb.h"
975 -#include "../gnulib/lib/xalloc.h"
976 -#include "../gnulib/lib/error.h"
977 -#include "../gnulib/lib/human.h"
982 #include "closeout.h"
983 #include "nextelem.h"
984 @@@@ -468,10 +468,36 @@@@ visit_justprint_unquoted(struct process_
985 return VISIT_CONTINUE;
989 +toolong (struct process_data *procdata)
992 + _("locate database %s contains a "
993 + "filename longer than locate can handle"),
998 +extend (struct process_data *procdata, size_t siz1, size_t siz2)
1000 + /* Figure out if the addition operation is safe before performing it. */
1001 + if (SIZE_MAX - siz1 < siz2)
1003 + toolong (procdata);
1005 + else if (procdata->pathsize < (siz1+siz2))
1007 + procdata->pathsize = siz1+siz2;
1008 + procdata->original_filename = x2nrealloc (procdata->original_filename,
1009 + &procdata->pathsize,
1015 visit_old_format(struct process_data *procdata, void *context)
1018 + register size_t i;
1021 /* Get the offset in the path where this path info starts. */
1022 @@@@ -479,20 +505,35 @@@@ visit_old_format(struct process_data *pr
1023 procdata->count += getw (procdata->fp) - LOCATEDB_OLD_OFFSET;
1025 procdata->count += procdata->c - LOCATEDB_OLD_OFFSET;
1026 + assert(procdata->count > 0);
1028 - /* Overlay the old path with the remainder of the new. */
1029 - for (s = procdata->original_filename + procdata->count;
1030 + /* Overlay the old path with the remainder of the new. Read
1031 + * more data until we get to the next filename.
1033 + for (i=procdata->count;
1034 (procdata->c = getc (procdata->fp)) > LOCATEDB_OLD_ESCAPE;)
1035 - if (procdata->c < 0200)
1036 - *s++ = procdata->c; /* An ordinary character. */
1039 - /* Bigram markers have the high bit set. */
1040 - procdata->c &= 0177;
1041 - *s++ = procdata->bigram1[procdata->c];
1042 - *s++ = procdata->bigram2[procdata->c];
1046 + if (procdata->c < 0200)
1048 + /* An ordinary character. */
1049 + extend (procdata, i, 1u);
1050 + procdata->original_filename[i++] = procdata->c;
1054 + /* Bigram markers have the high bit set. */
1055 + extend (procdata, i, 2u);
1056 + procdata->c &= 0177;
1057 + procdata->original_filename[i++] = procdata->bigram1[procdata->c];
1058 + procdata->original_filename[i++] = procdata->bigram2[procdata->c];
1062 + /* Consider the case where we executed the loop body zero times; we
1063 + * still need space for the terminating null byte.
1065 + extend (procdata, i, 1u);
1066 + procdata->original_filename[i] = 0;
1068 procdata->munged_filename = procdata->original_filename;
1075 Thanks to Rob Holland <rob@@inversepath.com> and Tavis Ormandy.
1078 VIII. CVE INFORMATION
1079 =====================
1081 No CVE candidate number has yet been assigned for this vulnerability.
1082 If someone provides one, I will include it in the public announcement
1086 The original announcement above was sent out with a cleartext PGP
1087 signature, of course, but that has been omitted from the example.
1089 Once a fixed release is available, announce the new release using the
1090 normal channels. Any CVE number assigned for the problem should be
1091 included in the @file{ChangeLog} and @file{NEWS} entries. See
1092 @url{http://cve.mitre.org/} for an explanation of CVE numbers.
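
The test script in the example alert classifies a @code{locate} binary
as vulnerable when its exit status is greater than 128, which is how
the shell reports a child process that was killed by a signal. The
following is a minimal sketch of that check, not part of findutils;
the helper name @code{killed_by_signal} is hypothetical, and the
demonstration uses SIGTERM rather than a real crash.

```shell
# killed_by_signal STATUS: succeed if STATUS denotes death by signal.
# (Hypothetical helper, for illustration only.)
killed_by_signal() (
  [ "$1" -gt 128 ]
)

# Demonstration: a child shell that sends SIGTERM to itself.
status=0
sh -c 'kill -s TERM $$' || status=$?

if killed_by_signal "$status" ; then
  echo "child was killed by a signal (status $status)"
else
  echo "child exited normally (status $status)"
fi
```

An exit status of 128 or below means the program ran to completion,
even if it reported an error, so it is not evidence of a buffer
overrun.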
1096 @node Making Releases
1097 @chapter Making Releases
This chapter will eventually explain in detail how to make a findutils
release. For the time being, here is a terse description of the main steps:
1102 @item Commit changes; make sure your working directory has no
1103 uncommitted changes.
1104 @item Test; make sure that all changes you have made have tests, and
1105 that the tests pass. Verify this with @code{make distcheck}.
1106 @item Bugs; make sure all Savannah bug entries fixed in this release
@item NEWS; make sure that the @file{NEWS} and @file{configure.in} files are
updated with the new release number (and checked in).
1110 @item Build the release tarball; do this with @code{make distcheck}.
1111 Copy the tarball somewhere safe.
1112 @item Tag the release; findutils releases are tagged in CVS as
1113 FINDUTILS_x_y_z-1. For example, the tag for findutils release 4.3.8
1114 is FINDUTILS_4_3_8-1.
1115 @item Prepare the upload and upload it.
1116 @xref{Automated FTP Uploads, ,Automated FTP
1117 Uploads, maintain, Information for Maintainers of GNU Software},
1118 for detailed upload instructions.
1119 @item Make a release announcement; include an extract from the NEWS
1120 file which explains what's changed. Announcements for test releases
1121 should just go to @email{bug-findutils@@gnu.org}. Announcements for
1122 stable releases should go to @email{info-gnu@@gnu.org} as well.
1123 @item Bump the release numbers in CVS; edit the @file{configure.in}
1124 and @file{NEWS} files to advance the release numbers. For example,
1125 if you have just released @samp{4.6.2}, bump the release number to
1126 @samp{4.6.3-CVS}. The point of the @samp{-CVS} suffix here is that a
1127 findutils binary built from CVS will bear a release number indicating
it's not built from the ``official'' source release.
1129 @item Close bugs; any bugs recorded on Savannah which were fixed in this
1130 release should now be marked as closed. Update the @samp{Fixed
1131 Release} field of these bugs appropriately and make sure the
1132 @samp{Assigned to} field is populated.
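
The tagging and version-bump conventions above can be sketched as two
small shell functions. This is only an illustration: the function
names @code{release_tag} and @code{next_dev_version} are hypothetical
and are not part of the findutils source tree. The sketch also assumes
a purely numeric version such as 4.3.8.

```shell
# release_tag VERSION: print the CVS tag for a findutils release,
# following the FINDUTILS_x_y_z-1 convention.
release_tag() (
  printf 'FINDUTILS_%s-1\n' "$(printf '%s' "$1" | tr . _)"
)

# next_dev_version VERSION: print the development version to record
# in configure.in and NEWS after releasing VERSION.
next_dev_version() (
  head=$(printf '%s\n' "$1" | sed 's/\.[0-9]*$//')   # e.g. 4.6
  last=$(printf '%s\n' "$1" | sed 's/.*\.//')        # e.g. 2
  printf '%s.%s-CVS\n' "$head" "$((last + 1))"
)

release_tag 4.3.8        # prints FINDUTILS_4_3_8-1
next_dev_version 4.6.2   # prints 4.6.3-CVS
```

Both functions only format strings; running the actual @code{cvs}
commands and editing the files remain manual steps.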
1136 @node GNU Free Documentation License
1137 @appendix GNU Free Documentation License
1142 @comment texi related words used by Emacs' spell checker ispell.el
1144 @comment LocalWords: texinfo setfilename settitle setchapternewpage
1145 @comment LocalWords: iftex finalout ifinfo DIR titlepage vskip pt
1146 @comment LocalWords: filll dir samp dfn noindent xref pxref
1147 @comment LocalWords: var deffn texi deffnx itemx emph asis
1148 @comment LocalWords: findex smallexample subsubsection cindex
1149 @comment LocalWords: dircategory direntry itemize
1151 @comment other words used by Emacs' spell checker ispell.el
1152 @comment LocalWords: README fred updatedb xargs Plett Rendell akefile
1153 @comment LocalWords: args grep Filesystems fo foo fOo wildcards iname
1154 @comment LocalWords: ipath regex iregex expr fubar regexps
1155 @comment LocalWords: metacharacters macs sr sc inode lname ilname
1156 @comment LocalWords: sysdep noleaf ls inum xdev filesystems usr atime
1157 @comment LocalWords: ctime mtime amin cmin mmin al daystart Sladkey rm
1158 @comment LocalWords: anewer cnewer bckw rf xtype uname gname uid gid
1159 @comment LocalWords: nouser nogroup chown chgrp perm ch maxdepth
1160 @comment LocalWords: mindepth cpio src CD AFS statted stat fstype ufs
1161 @comment LocalWords: nfs tmp mfs printf fprint dils rw djm Nov lwall
1162 @comment LocalWords: POSIXLY fls fprintf strftime locale's EDT GMT AP
1163 @comment LocalWords: EST diff perl backquotes sprintf Falstad Oct cron
1164 @comment LocalWords: eg vmunix mkdir afs allexec allwrite ARG bigram
1165 @comment LocalWords: bigrams cd chmod comp crc CVS dbfile dum eof
1166 @comment LocalWords: fileserver filesystem fn frcode Ghazi Hnewc iXX
1167 @comment LocalWords: joeuser Kaveh localpaths localuser LOGNAME
1168 @comment LocalWords: Meyering mv netpaths netuser nonblank nonblanks
1169 @comment LocalWords: ois ok Pinard printindex proc procs prunefs
1170 @comment LocalWords: prunepaths pwd RFS rmadillo rmdir rsh sbins str
1171 @comment LocalWords: su Timar ubins ug unstripped vf VM Weitzel
1172 @comment LocalWords: wildcard zlogout basename execdir wholename iwholename
1173 @comment LocalWords: timestamp timestamps Solaris FreeBSD OpenBSD POSIX