From fa0f4ca450c33cf03d78d5912102c3fd511c2427 Mon Sep 17 00:00:00 2001 From: jay Date: Wed, 27 Jun 2007 22:02:47 +0000 Subject: [PATCH] Added maintenance manual --- ChangeLog | 7 + NEWS | 3 + doc/Makefile.am | 5 +- doc/find-maint.texi | 1165 +++++++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 1178 insertions(+), 2 deletions(-) create mode 100644 doc/find-maint.texi diff --git a/ChangeLog b/ChangeLog index 06cb1e3..0147aaa 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,10 @@ +2007-06-27 James Youngman + + Added a maintenance manual. + * doc/find-maint.texi: New file. + * doc/Makefile.am (info_TEXINFOS): Added doc/find-main.texi. + * doc/find.texi (Introduction): Fixed typo. + 2007-06-26 Eric Blake * import-gnulib.config (modules): Allow ./configure diff --git a/NEWS b/NEWS index 4ae957b..d4c213e 100644 --- a/NEWS +++ b/NEWS @@ -17,6 +17,9 @@ several fields. Also explain that when reporting a bug, you should check the most recent findutils release first. +Introduced doc/find-maint.texi, a maintenance manual for findutils. 
+ + * Major changes in release 4.3.8 ** Bug Fixes diff --git a/doc/Makefile.am b/doc/Makefile.am index ec70e4f..55b5724 100644 --- a/doc/Makefile.am +++ b/doc/Makefile.am @@ -1,5 +1,6 @@ -info_TEXINFOS = find.texi -find_TEXINFOS = perm.texi getdate.texi regexprops.texi +info_TEXINFOS = find.texi find-maint.texi +find_TEXINFOS = perm.texi getdate.texi regexprops.texi +find_maint_TEXINFOS = MOSTLYCLEANFILES = find.cps CLEANFILES = find.txt find_mono.html findutils.texi_html_node.tar.gz diff --git a/doc/find-maint.texi b/doc/find-maint.texi new file mode 100644 index 0000000..6f3034a --- /dev/null +++ b/doc/find-maint.texi @@ -0,0 +1,1165 @@ +\input texinfo @c -*-texinfo-*- +@c %**start of header +@setfilename find-maint.info +@settitle Maintaining Findutils +@c For double-sided printing, uncomment: +@c @setchapternewpage odd +@c %**end of header + +@include versionmaint.texi + +@iftex +@finalout +@end iftex + +@dircategory GNU organization +@direntry +* Maintaining Findutils: (find-maint). Maintaining GNU findutils +@end direntry + +@copying +This manual explains how GNU findutils is maintained, how changes should +be made and tested, and what resources exist to help developers. + +This is edition @value{EDITION}, for findutils version @value{VERSION}. + +Copyright @copyright{} 2007 Free Software Foundation, Inc. + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.2 +or any later version published by the Free Software Foundation; +with no Invariant Sections, with no +Front-Cover Texts, and with no Back-Cover Texts. +A copy of the license is included in the section entitled ``GNU +Free Documentation License''. 
+@end copying + +@titlepage +@title Maintaining Findutils +@subtitle Edition @value{EDITION}, for GNU findutils version @value{VERSION} +@subtitle @value{UPDATED} +@author by James Youngman + +@page +@vskip 0pt plus 1filll +@insertcopying{} +@end titlepage + +@contents + +@ifnottex +@node Top, Introduction, (dir), (dir) +@top Maintaining GNU Findutils + +@insertcopying +@end ifnottex + +@menu +* Introduction:: +* Maintaining GNU Programs:: +* Design Issues:: +* Coding Conventions:: +* Tools:: +* Using the GNU Portability Library:: +* Documentation:: +* Testing:: +* Bugs:: +* Distributions:: +* Internationalisation:: +* Security:: +* Making Releases:: +@end menu + + + + + +@node Introduction +@chapter Introduction + +This document explains how to contribute to and maintain GNU +Findutils. It concentrates on developer-specific issues. For +information about how to use the software please refer to +@ref{Introduction, ,Introduction,find,The Findutils manual}. + +This manual aims to be useful without necessarily being verbose. It's +also a recent document, so there will be many areas in which +improvements can be made. If you find that the document misses out +important information or any part of the document is so terse as to +be unhelpful, please ask for help on the @email{bug-findutils@@gnu.org} +mailing list. We'll try to improve this document too. + + +@node Maintaining GNU Programs +@chapter Maintaining GNU Programs + +GNU Findutils is part of the GNU Project and so there are a number of +documents which set out standards for the maintenance of GNU +software. + +@table @file +@item standards.texi +GNU Project Coding Standards. All changes to findutils should comply +with these standards. In some areas we go somewhat beyond the +requirements of the standards, but these cases are explained in this +manual. +@item maintain.texi +Information for Maintainers of GNU Software. This document provides +guidance for GNU maintainers. 
Everybody with commit access should +read this document. Everybody else is welcome to do so too, of +course. +@end table + + + +@node Design Issues +@chapter Design Issues + +The findutils package is installed on many many systems, usually as a +fundamental component. The programs in the package are often used in +order to successfully boot or fix the system. + +This fact means that for findutils we bear in mind considerations that +may not apply so much as for other packages. For example, the fact +that findutils is often a base component motivates us to +@itemize +@item Limit dependencies on libraries +@item Avoid dependencies on other large packages (for example, interpreters) +@item Be conservative when making changes to the 'stable' release branch +@end itemize + +All those considerations come before functionality. Functional +enhancements are still made to findutils, but these are almost +exclusively introduced in the 'development' release branch, to allow +extensive testing and proving. + +Sometimes it is useful to have a priority list to provide guidance +when making design trade-offs. For findutils, that priority list is: + +@enumerate +@item Correctness +@item Standards compliance +@item Security +@item Backward compatibility +@item Performance +@item Functionality +@end enumerate + +For example, we support the @code{-exec} action because POSIX +compliance requires this, even though there are security problems with +it and we would otherwise prefer people to use @code{-execdir}. There +are also cases where some performance is sacrificed in the name of +security. For example, the sanity checks that @code{find} performs +while traversing a directory tree may slow it down. We adopt +functional changes, and functional changes are allowed to make +@code{find} slower, but only if there is no detectable impact on users +who don't use the feature. 
+ +Backward-incompatible changes do get made in order to comply with +standards (for example the behaviour of @code{-perm -...} changed in +order to comply with POSIX). However, they don't get made in order to +provide better ease of use; for example the semantics of @code{-size +-2G} are almost always unexpected by users, but we retain the current +behaviour because of backward compatibility and for its similarity to +the block-rounding behaviour of @code{-size -30}. We might introduce +a change which does not have the unfortunate rounding behaviour, but +we would choose another syntax (for example @code{-size '<2G'}) for +this. + +In a general sense, we try to do test-driven development of the +findutils code; that is, we try to implement test cases for new +features and bug fixes before modifying the code to make the test +pass. Some features of the code are tested well, but the test +coverage for other features is less good. If you are about to modify +the code for a predicate and aren't sure about the test coverage, use +@code{grep} on the test directories and measure the coverage with +@code{gcov} or another test coverage tool. + +Lastly, we try not to depend on having a ``working system''. The +findutils suite is used for diagnosis of problems, and this applies +especially to @code{find}. We should ensure that @code{find} still +works on relatively broken systems, for example systems with damaged +@file{/etc/passwd} files. Another interesting example is the case +where a system is a client of one or more unresponsive NFS servers. +On such a system, if you try to stat all mount points, your program +will hang indefinitely, waiting for the remote NFS server to respond. + + + +@c Installed on many systems +@c Often part of base +@c Needs to work on broken systems (e.g. 
unresponsive NFS servers, +@c mode-0 files) + +@node Coding Conventions +@chapter Coding Conventions + +Coding style documents which set out to establish a uniform look and +feel to source code have worthy goals, for example greater ease of +maintenance and readability. However, I do not believe that in +general coding style guide authors can envisage every situation, and +it is always possible that it might on occasion be necessary to break +the letter of the style guide in order to honour its spirit, or to +better achieve the style guide's goals. + +I've certainly seen many style guides outside the free software world +which make bald statements such as ``functions shall have exactly one +return statement''. The desire to ensure consistency and obviousness +of control flow is laudable, but it is all too common for such bald +requirements to be followed unthinkingly. Certainly I've seen such +coding standards result in unmaintainable code with terrible +infelicities such as functions containing @code{if} statements nested +nine levels deep. I suppose such coding standards don't survive in +free software projects because they tend to drive away potential +contributors or tend to generate heated discussions on mailing lists. +Equally, a nine-level-deep function in a free software program would +quickly get refactored, assuming it is obvious what the function is +supposed to do... + +Be that as it may, the approach I will take for this document is to +explain some idioms and practices in use in the findutils source code, +and leave it up to the reader's engineering judgement to decide which +considerations apply to the code they are working on, and whether or +not there is sufficient reason to ignore the guidance in current +circumstances. 
+ + +@menu +* Make the Compiler Find the Bugs:: +* The File System Is Being Modified:: +* Don't Trust the File System Contents:: +* Debugging is For Users Too:: +* Factor Out Repeated Code:: +@end menu + +@node Make the Compiler Find the Bugs +@section Make the Compiler Find the Bugs + +Finding bugs is tedious. If I have a filesystem containing two +million files, and a find command line should print one million of +them, but in fact it misses out 1%, you can tell the program is +printing the wrong result only if you know the right answer for that +filesystem at that time. If you don't know this, you may just not +find out about that bug. For this reason it is important to have a +comprehensive test suite. + +The test suite is of course not the only way to find the bugs. The +findutils source code makes liberal use of the assert macro. While on +the one hand these might be a performance drain, the performance +impact of most of these is negligible compared to the time taken to +fetch even one sector from a disk drive. + +Assertions should not be used to check the results of operations which +may be affected by the program's external environment. For example, +never assert that a file could be opened successfully. Errors +relating to problems with the program's execution environment should +be diagnosed with a user-oriented error message. An assertion failure +should always denote a bug in the program. + +Several programs in the findutils suite perform self-checks. See for +example the function @code{pred_sanity_check} in @file{find/pred.c}. +This is generally desirable. + +There are also a number of small ways in which we can help the +compiler to find the bugs for us. + +@subsection Constants in Equality Testing + +It's a common error to write @code{=} when @code{==} is meant. +Sometimes this happens in new code and is simply due to finger +trouble. Sometimes it is the result of the inadvertent deletion of a +character. 
In any case, there is a subset of cases where we can +persuade the compiler to generate an error message when we make this +mistake; this is where the equality test is with a constant. + +This is an example of a vulnerable piece of code. + +@example +if (x == 2) + ... +@end example + +A simple typo converts the above into + +@example +if (x = 2) + ... +@end example + +We've introduced a bug; the condition is always true, and the value of +@code{x} has been changed. However, a simple change to our practice +would have made us immune to this problem: + +@example +if (2 == x) + ... +@end example + +Usually, the Emacs keystroke @kbd{M-t} can be used to swap the operands. + + +@subsection Spelling of ASCII NUL + +Strings in C are just sequences of characters terminated by a NUL. +The ASCII NUL character has the numerical value zero. It is normally +represented in C code as @samp{\0}. Here is a typical piece of C +code: + +@example +*p = '\0'; +@end example + +Consider what happens if there is an unfortunate typo: + +@example +*p = '0'; +@end example + +We have changed the meaning of our program and the compiler cannot +diagnose this as an error. Our string is no longer terminated. Bad +things will probably happen. It would be better if the compiler could +help us diagnose this problem. + +In C, the type of @code{'\0'} is in fact int, not char. This provides +us with a simple way to avoid this error. The constant @code{0} has +the same value and type as the constant @code{'\0'}. However, it is +not as vulnerable to typos. For this reason I normally prefer to +use this code: + +@example +*p = 0; +@end example + + +@node Factor Out Repeated Code +@section Factor Out Repeated Code + +Repeated code imposes a greater maintenance burden and increases the +exposure to bugs. For example, if you discover that something you +want to implement has some similarity with an existing piece of code, +don't cut and paste it. Instead, factor the code out. 
The risk of +cutting and pasting the code, particularly if you do this several +times, is that you end up with several copies of the same code. + +If the original code had a bug, you now have N places where this needs +to be fixed. It's all too easy to miss some out when trying to fix the +bug. Equally, it's quite possible that when pasting the code into +some function, the pasted code was not quite adapted correctly to its +new environment. To pick a contrived example, perhaps it modifies a +global variable which that code shouldn't be touching in its new +home. Worse, perhaps it makes some unstated assumption about the +nature of the input arguments which is in fact not true for the +context of the now duplicated code. + +A good example of the use of refactoring in findutils is the +@code{collect_arg} function in @file{find/parser.c}. A less clear-cut +but larger example is the factoring out of code which would otherwise +have been duplicated between @file{find/find.c} and +@file{find/ftsfind.c}. + +The findutils test suite is comprehensive enough that refactoring code +should not generally be a daunting prospect from a testing point of +view. Nevertheless there are some areas which are only +lightly tested: + +@enumerate +@item Tests on the ages of files +@item Code which deals with the values returned by operating system calls (for example handling of ENOENT) +@item Code dealing with OS limits (for example, limits on path length +or exec arguments) +@item Code relating to features not all systems have (for example +Solaris Doors) +@end enumerate + +Please exercise caution when working in those areas. + + +@node Debugging is For Users Too +@section Debugging is For Users Too + +Debug and diagnostic code is often used to verify that a program is +working in the way its author thinks it should be. But users are +often uncertain about what a program is doing, too. Exposing a +little more diagnostic information to them can help. 
Much of the diagnostic +code in @code{find}, for example, is controlled by the @samp{-D} flag, +as opposed to C preprocessor directives. + +Making diagnostic messages available to users means that the +phrasing of the diagnostic messages becomes important, too. + + +@node Don't Trust the File System Contents +@section Don't Trust the File System Contents + +People use @code{find} to search in directories created by other +people. Sometimes they do this to check for suspicious activity (for +example to look for new setuid binaries). This means that it would be +bad if @code{find} were vulnerable to, say, a security problem +exploitable by constructing a specially-crafted filename. The same +consideration would apply to @code{locate} and @code{updatedb}. + +Henry Spencer said this well in his fifth commandment: +@quotation +Thou shalt check the array bounds of all strings (indeed, all arrays), +for surely where thou typest @samp{foo} someone someday shall type +@samp{supercalifragilisticexpialidocious}. +@end quotation + +Symbolic links can often be a problem. If @code{find} calls +@code{lstat} on something and discovers that it is a directory, it's +normal for @code{find} to recurse into it. Even if the @code{chdir} +system call is used immediately, there is still a window of +opportunity between the @code{lstat} and the @code{chdir} in which a +malicious person could rename the directory and substitute a symbolic +link to some other directory. + +@node The File System Is Being Modified +@section The File System Is Being Modified + +The filesystem gets modified while you are traversing it. For +example, it's normal for files to get deleted while @code{find} is +traversing a directory. Issuing an error message seems helpful when a +file is deleted from the one directory you are interested in, but if +@code{find} is searching 15000 directories, such a message becomes +less helpful. 
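As a rough illustration of the point above, a traversal can treat an entry that vanishes between being listed and being examined as a normal event rather than as an error. This is only a sketch under stated assumptions, not code from findutils: the names @code{examine_entry} and @code{entry_status} are hypothetical, and treating @code{ENOENT} and @code{ENOTDIR} as ``vanished'' is one plausible policy among several.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical result codes; not taken from the findutils sources.  */
enum entry_status { ENTRY_OK, ENTRY_VANISHED, ENTRY_ERROR };

/* Examine PATH without following symbolic links.  An entry which has
   disappeared since its name was read from the directory is reported
   as ENTRY_VANISHED so that the caller can skip it quietly.  Any
   other failure is ENTRY_ERROR, with errno left set so that the
   caller can issue a user-oriented diagnostic.  */
enum entry_status
examine_entry (const char *path, struct stat *st)
{
  if (0 == lstat (path, st))
    return ENTRY_OK;
  if (ENOENT == errno || ENOTDIR == errno)
    return ENTRY_VANISHED;
  return ENTRY_ERROR;
}
```

A caller might print a message only for @code{ENTRY_ERROR}, and mention vanished entries only when the user has asked for extra diagnostics.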
+ +Bear in mind also that it is possible that the directory @code{find} is +currently searching could be moved to another point in the filesystem, +and that the directory in which @code{find} was started could be +deleted. + +Henry Spencer's sixth commandment is also apposite here: +@quotation +If a function be advertised to return an error code in the event of +difficulties, thou shalt check for that code, yea, even though the +checks triple the size of thy code and produce aches in thy typing +fingers, for if thou thinkest ``it cannot happen to me'', the gods +shall surely punish thee for thy arrogance. +@end quotation + +There are a lot of files out there. They come in all dates and +sizes. There is a condition out there in the real world to exercise +every bit of the code base. So we try to test that code base before +someone falls over a bug. + + +@node Tools +@chapter Tools +Most of the tools required to build findutils are mentioned in the +file @file{README-CVS}. We also use some other tools: + +@table @asis +@item System call traces +Much of the execution time of @code{find} is spent waiting for filesystem +operations. A system call trace (for example, that provided by +@code{strace}) shows what system calls are being made. Using this +information we can work to remove unnecessary file system operations. + +@item Valgrind +Valgrind is a tool which dynamically verifies the memory accesses a +program makes to ensure that they are valid (for example, that the +behaviour of the program does not in any way depend on the contents of +uninitialised memory). + +@item DejaGnu +DejaGnu is the test framework used to run the findutils test suite +(the @code{runtest} program is part of DejaGnu). It would be ideal if +everybody building @code{findutils} also ran the test suite, but many +people don't have DejaGnu installed. When changes are made to +findutils, DejaGnu is invoked a lot. @xref{Testing}, for more +information. 
+@end table + +@node Using the GNU Portability Library +@chapter Using the GNU Portability Library +The Gnulib library (@url{http://www.gnu.org/software/gnulib/}) makes a +variety of systems look more like a GNU/Linux system and also applies +a bunch of automatic bug fixes and workarounds. Some of these also +apply to GNU/Linux systems too. For example, the Gnulib regex +implementation is used when we determine that we are building on a +GNU libc system with a bug in the regex implementation. + + +@section How and Why we Import the Gnulib Code +Gnulib does not have a release process which results in a source +tarball you can download. Instead, the code is simply made available +by CVS. + +GNU projects vary in how they interact with Gnulib. Many import a +selection of code from Gnulib into the working directory and then +check the updated files into the CVS repository for their project. +The coreutils project does this, for example. + +At the last maintainer changeover for findutils (2003) it turned out +that there was a lot of material in findutils in common with Gnulib, +but it had not been updated in a long time. It was difficult to +figure out which source files were intended to track external sources +and which were intended to contain incompatible changes, or diverge +for other reasons. + +To reduce this uncertainty, I decided to treat Gnulib much like +Automake. Files supplied by Automake are simply absent from the +findutils source tree. When Automake is run with @code{automake +--add-missing --copy}, it adds in all the files it thinks should be +there which aren't there already. + +An analogous approach is taken with Gnulib. The Gnulib code is +imported from the CVS repository for Gnulib with a findutils helper +script, @code{import-gnulib.sh}. That script fetches a copy of the +Gnulib code into the subdirectory @file{gnulib-cvs} and then runs +@code{gnulib-tool}. 
The @code{gnulib-tool} program copies the +required parts of Gnulib into the findutils source tree in the +subdirectory @file{gnulib}. This process gives us the property that +the code in @file{gnulib} and @file{gnulib-cvs} is not included in the +findutils CVS tree. Both directories are listed in @file{.cvsignore} +and so CVS ignores them. + +Findutils does not use all the Gnulib code. The modules we need are +listed in the file @file{import-gnulib.config}. The same file also +indicates the version of Gnulib that we want to use. Since Gnulib has +no actual release process, we just use a date. Both +@file{import-gnulib.sh} and @file{import-gnulib.config} are in the +findutils CVS repository. + +The upshot of all this is that we can use the findutils CVS repository +to track which version of Gnulib every findutils release uses. That +information is also provided when the user invokes a findutils program +with the @samp{--version} option. It also means that if a file exists +in the findutils CVS repository and is different from a similar file +elsewhere, you can be certain that it's for a reason. + +There are a small number of exceptions to this: the standard +boiler-plate GNU files such as @file{ABOUT-NLS}, @file{INSTALL} and +@file{COPYING}. + + +@section How We Fix Gnulib Bugs +If we always import the Gnulib code directly from the CVS +repository in this way, it is impossible to maintain a locally +different copy of Gnulib. This is often a benefit in that accidental +version skew is prevented. + +However, sometimes we want deliberate version skew in order to use a +findutils-specific patched version of a Gnulib file, for example +because we fixed a bug. + +Gnulib is used by quite a number of GNU projects, and this means that +it gets plenty of testing. Therefore there are relatively few bugs in +the Gnulib code, but they do turn up from time to time. 
+ +However, since there is no waiting around for a Gnulib source release +tarball, Gnulib bugs are generally fixed quickly. Here is an outline +of the way we would contribute a fix to Gnulib (assuming you know it +is not already fixed in current Gnulib CVS): + +@table @asis +@item Check you already completed a copyright assignment for Gnulib +@item Begin with a vanilla CVS tree +Download the Findutils source code from CVS (or use the tree you have +already) +@item Check out a copy of the Gnulib source +An easy way to do this is to simply use @code{cp -ar} on the +@file{gnulib-cvs} directory. Have the Gnulib code checked out +somewhere @emph{outside} your working CVS tree for findutils. +@item Import Gnulib from your local copy +The @code{import-gnulib.sh} tool has a @samp{-d} option which you can +use to import the code from a local copy of Gnulib. +@item Build findutils +Build findutils and run the test suite, which should pass. In our +example we assume you have just noticed a bug in Gnulib, not that +recent Gnulib changes broke the findutils regression tests. +@item Write a test case +If in fact Gnulib did break the findutils regression tests, you can probably +skip this step, since you already have a test case demonstrating the problem. +Otherwise, write a findutils test case for the bug and/or a Gnulib test case. +@item Fix the Gnulib bug +Make sure your editor follows symbolic links so that your changes to +@file{gnulib/...} actually affect the files in the CVS working +directory you checked out earlier. Observe that your test now passes. +@item Prepare a Gnulib patch +Use @code{cvs -z3 diff -upN} to prepare the patch. Write a ChangeLog +entry and prepend this to the patch. Check that the patch conforms +with the GNU coding standards, and email it to the Gnulib mailing +list. 
+@item Wait for the patch to be applied +Once your bug fix has been applied, you can update your local directory +from CVS, re-import the code into Findutils (still using the @samp{-d} +option), and re-run the tests. This verifies that the fix the Gnulib +team made actually fixes your problem. +@item Reimport the Gnulib code +Update the findutils file @file{import-gnulib.config} to specify a +date which is after the point at which the bug fix was committed to +Gnulib. Finally, re-import the Gnulib code directly from CVS by using +@samp{import-gnulib.sh} without the @samp{-d} option, and run the +tests again. This verifies that there was no remaining local change +that we were relying on to fix the bug. + +Be aware that the date specified in the +@file{import-gnulib.config} file selects the latest changes for the +given date, so if you modify @file{import-gnulib.config} as soon as +someone tells you they checked in a bugfix and you set +@var{gnulib_version} to today's date, there will be some file version +instability for the rest of the day. + +@end table + +@node Documentation +@chapter Documentation + +The findutils CVS tree includes several different types of +documentation. + +@section User Documentation +User-oriented documentation is provided as manual pages and in +Texinfo. See +@ref{Introduction,,Introduction,find,The Findutils manual}. + +Please make sure both sets of documentation are updated if you make a +change to the code. The GNU coding standards do not normally call for +maintaining manual pages on the grounds of effort duplication. +However, the manual page format is more convenient for quick +reference, and so it's worth maintaining both types of documentation. +The manual pages are nevertheless normally rather more terse than the +Texinfo documentation. The manual pages are suitable for reference +use, but the Texinfo manual should also include introductory and +tutorial material. 
+ + +@section Build Guidance + +@table @file +@item ABOUT-NLS +Describes the Free Translation Project, the translation status of +various GNU projects, and how to participate by translating an +application. +@item AUTHORS +Lists the authors of findutils. +@item COPYING +The copyright license covering findutils; currently, the GNU GPL, +version 2. +@item INSTALL +Generic instructions for installing GNU programs. +@item README +Information about how to compile findutils in particular. +@item README-alpha +A README file which is included with testing releases of findutils. +@item README-CVS +Describes how to build findutils from the code in CVS. +@item THANKS +Thanks to people who contributed to findutils. Generally, if +someone's contribution was significant enough to need a copyright +assignment, their name should go in here. +@item TODO +Mainly obsolete. +@end table + + +@section Release Information +@table @file +@item NEWS +Enumerates the user-visible changes in each release. Typical changes +are fixed bugs, functionality changes and documentation changes. +@item ChangeLog +This file enumerates all changes to the findutils source code (with +the possible exception of @file{.cvsignore} and @file{.gitignore} +changes). The level of detail used for this file should be sufficient +to answer the questions ``what changed?'' and ``why was it changed?''. +If a change fixes a bug, always give the bug reference number in both +the @file{ChangeLog} and @file{NEWS} files and of course also in the +checkin message. In general, it should be possible to enumerate all +material changes to a function by searching for its name in +@file{ChangeLog}. +@end table + +@node Testing +@chapter Testing +This chapter will explain the general procedures for adding tests to +the test suite, and the functions defined in the findutils-specific +DejaGnu configuration. Where appropriate, references will be made to +the DejaGnu documentation. 
+ +@node Bugs +@chapter Bugs + +Bugs are logged in the Savannah bug tracker +@url{http://savannah.gnu.org/bugs/?group=findutils}. The tracker +offers several fields but their use is largely obvious. The +life-cycle of a bug is like this: + + +@table @asis +@item Open +Someone, usually a maintainer, a distribution maintainer or a user, +creates a bug by filling in the form. They fill in field values as +they see fit. This will generate an email to +@email{bug-findutils@@gnu.org}. + +@item Triage +The bug hangs around with @samp{Status=None} until someone begins to +work on it. At that point they set the ``Assigned To'' field and will +sometimes set the status to @samp{In Progress}, especially if the bug +will take a while to fix. + +@item Non-bugs +Quite a lot of reports are not actually bugs; for these the usual +procedure is to explain why the problem is not a bug, set the status +to @samp{Invalid} and close the bug. Make sure you set the +@samp{Assigned to} field to yourself before closing the bug. + +@item Fixing +When you commit a bug fix into CVS (or in the case of a contributed +patch, commit the change), mark the bug as @samp{Fixed}. Make sure +you include a new test case where this is relevant. If you can figure +out which releases are affected, please also set the @samp{Release} +field to the earliest release which is affected by the bug. +Indicate which source branch the fix is included in (for example, +4.2.x or 4.3.x). Don't close the bug yet. + +@item Release +When a release is made which includes the bug fix, make sure the bug +is listed in the NEWS file. Once the release is made, fill in the +@samp{Fixed Release} field and close the bug. +@end table + + +@node Distributions +@chapter Distributions +Almost all GNU/Linux distributions include findutils, but only some of +them have a package maintainer who is a member of the mailing list. 
+Distributions don't often feed back patches to the +@email{bug-findutils@@gnu.org} list, but on the other hand many of +their patches relate only to standards for file locations and so +forth, and are therefore distribution specific. On an irregular basis +I check the current patches being used by one or two distributions, +but the total number of GNU/Linux distributions is large enough that +we could not hope to cover them all. + +Often, bugs are raised against a distribution's bug tracker instead of +GNU's. Periodically (about every six months) I take a look at some +of the more accessible bug trackers to indicate which bugs have been +fixed upstream. + +Many distributions include both findutils and the slocate package, +which provides a replacement @code{locate}. + + +@node Internationalisation +@chapter Internationalisation +Translation is essentially automated from the maintainer's point of +view. The TP mails the maintainer when a new PO file is available, +and we just download it and check it in. We copy the @file{.po} files +into the CVS repository. For more information, please see +@url{http://www.iro.umontreal.ca/translation/HTML/domain-findutils.html}. + + +@node Security +@chapter Security + +See @ref{Security Considerations, ,Security Considerations,find,The +Findutils manual}, for a full description of the findutils approach to +security considerations and discussion of particular tools. + +If someone reports a security bug publicly, we should fix this as +rapidly as possible. If necessary, this can mean issuing a fixed +release containing just the one bug fix. We try to avoid issuing +releases which include both significant security fixes and functional +changes. + +Where someone reports a security problem privately, we generally try +to construct and test a patch without checking the intermediate code +in. Once everything has been tested, this allows us to commit a patch +and immediately make a release. 
The advantage of doing things this +way is that we avoid situations where people watching for CVS commits +can figure out and exploit a security problem before a fixed release +is available. + +It's important that security problems be fixed promptly, but don't +rush so much that things go wrong. Make sure the new release really +fixes the problem. It's usually best not to include functional +changes in your security-fix release. + +If the security problem is serious, send an alert to +@email{vendor-sec@@lst.de}. The members of the list include most +GNU/Linux distributions. The point of doing this is to allow them to +prepare to release your security fix to their customers, once the fix +becomes available. Here is an example alert: + +@smallexample +GNU findutils heap buffer overrun (potential privilege escalation) + +$Revision: 1.1 $; $Date: 2007/06/27 22:02:47 $ + + +I. BACKGROUND +============= + +GNU findutils is a set of programs which search for files on Unix-like +systems. It is maintained by the GNU Project of the Free Software +Foundation. For more information, see +@url{http://www.gnu.org/software/findutils}. + + +II. DESCRIPTION +=============== + +When GNU locate reads filenames from an old-format locate database, +they are read into a fixed-length buffer allocated on the heap. +Filenames longer than the 1026-byte buffer can cause a buffer overrun. +The overrunning data can be chosen by any person able to control the +names of files created on the local system. This will normally +include all local users, but in many cases also remote users (for +example in the case of FTP servers allowing uploads). + +III. ANALYSIS +============= + +Findutils supports three different formats of locate database: its +native format "LOCATE02", the slocate variant of LOCATE02, and a +traditional ("old") format that locate uses on other Unix systems. 
+ +When locate reads filenames from a LOCATE02 database (the default +format), the buffer into which data is read is automatically extended +to accommodate the length of the filenames. + +This automatic buffer extension does not happen for old-format +databases. Instead a 1026-byte buffer is used. When a longer +pathname appears in the locate database, the end of this buffer is +overrun. The buffer is allocated on the heap (not the stack). + +If the locate database is in the default LOCATE02 format, the locate +program does perform automatic buffer extension, and the program is +not vulnerable to this problem. The software used to build the +old-format locate database is not itself vulnerable to the same +attack. + +Most installations of GNU findutils do not use the old database +format, and so will not be vulnerable. + + +IV. DETECTION +============= + +Software +-------- +All existing releases of findutils are affected. + + +Installations +------------- + +To discover the longest path name on a given system, you can use the +following command (requires GNU findutils and GNU coreutils): + +@verbatim +find / -print0 | tr -c '\0' 'x' | tr '\0' '\n' | wc -L +@end verbatim + +V. EXAMPLE +========== + +This section includes a shell script which determines which of a list +of locate binaries is vulnerable to the problem. The shell script has +been tested only on glibc-based systems having a mktemp binary. + +NOTE: This script deliberately overruns the buffer in order to +determine if a binary is affected. Therefore running it on your +system may have undesirable effects. We recommend that you read the +script before running it. + +@verbatim +#! 
/bin/sh +set +m +if vanilla_db="$(mktemp nicedb.XXXXXX)" ; then + if updatedb --prunepaths="" --old-format --localpaths="/tmp" \ + --output="$@{vanilla_db@}" ; then + true + else + rm -f "$@{vanilla_db@}" + vanilla_db="" + echo "Failed to create old-format locate database; skipping the sanity checks" >&2 + fi +fi + +make_overrun_db() @{ + # Start with a valid database + cat "$@{vanilla_db@}" + # Make the final entry really long + dd if=/dev/zero bs=1 count=1500 2>/dev/null | tr '\000' 'x' +@} + + + +ulimit -c 0 + +usage() @{ echo "usage: $0 binary [binary...]" >&2; exit $1; @} +[ $# -eq 0 ] && usage 1 + +bad="" +good="" +ugly="" +if dbfile="$(mktemp nasty.XXXXXX)" +then + make_overrun_db > "$dbfile" + for locate ; do + ver="$locate = $("$locate" --version | head -1)" + if [ -z "$vanilla_db" ] || "$locate" -d "$vanilla_db" "" >/dev/null ; then + "$locate" -d "$dbfile" "" >/dev/null + if [ $? -gt 128 ] ; then + bad="$bad +vulnerable: $ver" + else + good="$good +good: $ver" + fi + else + # the regular locate failed + ugly="$ugly +buggy, may or may not be vulnerable: $ver" + fi + done + rm -f "$@{dbfile@}" "$@{vanilla_db@}" + # good: unaffected. bad: affected (vulnerable). + # ugly: doesn't even work for a normal old-format database. + echo "$good" + echo "$bad" + echo "$ugly" +else + exit 1 +fi +@end verbatim + + + + +VI. VENDOR RESPONSE +=================== + +The GNU project discovered the problem while 'locate' was being worked +on; this is the first public announcement of the problem. + +The GNU findutils maintainer has issued a patch as part of this +announcement. The patch appears below. + +A source release of findutils-4.2.31 will be issued on 2007-05-30. +That release will of course include the patch. The patch will be +committed to the public CVS repository at the same time. Public +announcements of the release, including a description of the bug, will +be made at the same time as the release. 
+ +A release of findutils-4.3.x will follow and will also include the +patch. + + +VII. PATCH +========== + +This patch should apply to findutils-4.2.23 and later. +Findutils-4.2.23 was released almost two years ago. +@verbatim +Index: locate/locate.c +=================================================================== +RCS file: /cvsroot/findutils/findutils/locate/locate.c,v +retrieving revision 1.58.2.2 +diff -u -p -r1.58.2.2 locate.c +--- locate/locate.c 22 Apr 2007 16:57:42 -0000 1.58.2.2 ++++ locate/locate.c 28 May 2007 10:18:16 -0000 +@@@@ -124,9 +124,9 @@@@ extern int errno; + + #include "locatedb.h" + #include +-#include "../gnulib/lib/xalloc.h" +-#include "../gnulib/lib/error.h" +-#include "../gnulib/lib/human.h" ++#include "xalloc.h" ++#include "error.h" ++#include "human.h" + #include "dirname.h" + #include "closeout.h" + #include "nextelem.h" +@@@@ -468,10 +468,36 @@@@ visit_justprint_unquoted(struct process_ + return VISIT_CONTINUE; + @} + ++static void ++toolong (struct process_data *procdata) ++@{ ++ error (1, 0, ++ _("locate database %s contains a " ++ "filename longer than locate can handle"), ++ procdata->dbfile); ++@} ++ ++static void ++extend (struct process_data *procdata, size_t siz1, size_t siz2) ++@{ ++ /* Figure out if the addition operation is safe before performing it. */ ++ if (SIZE_MAX - siz1 < siz2) ++ @{ ++ toolong (procdata); ++ @} ++ else if (procdata->pathsize < (siz1+siz2)) ++ @{ ++ procdata->pathsize = siz1+siz2; ++ procdata->original_filename = x2nrealloc (procdata->original_filename, ++ &procdata->pathsize, ++ 1); ++ @} ++@} ++ + static int + visit_old_format(struct process_data *procdata, void *context) + @{ +- register char *s; ++ register size_t i; + (void) context; + + /* Get the offset in the path where this path info starts. 
*/ +@@@@ -479,20 +505,35 @@@@ visit_old_format(struct process_data *pr + procdata->count += getw (procdata->fp) - LOCATEDB_OLD_OFFSET; + else + procdata->count += procdata->c - LOCATEDB_OLD_OFFSET; ++ assert(procdata->count > 0); + +- /* Overlay the old path with the remainder of the new. */ +- for (s = procdata->original_filename + procdata->count; ++ /* Overlay the old path with the remainder of the new. Read ++ * more data until we get to the next filename. ++ */ ++ for (i=procdata->count; + (procdata->c = getc (procdata->fp)) > LOCATEDB_OLD_ESCAPE;) +- if (procdata->c < 0200) +- *s++ = procdata->c; /* An ordinary character. */ +- else +- @{ +- /* Bigram markers have the high bit set. */ +- procdata->c &= 0177; +- *s++ = procdata->bigram1[procdata->c]; +- *s++ = procdata->bigram2[procdata->c]; +- @} +- *s-- = '\0'; ++ @{ ++ if (procdata->c < 0200) ++ @{ ++ /* An ordinary character. */ ++ extend (procdata, i, 1u); ++ procdata->original_filename[i++] = procdata->c; ++ @} ++ else ++ @{ ++ /* Bigram markers have the high bit set. */ ++ extend (procdata, i, 2u); ++ procdata->c &= 0177; ++ procdata->original_filename[i++] = procdata->bigram1[procdata->c]; ++ procdata->original_filename[i++] = procdata->bigram2[procdata->c]; ++ @} ++ @} ++ ++ /* Consider the case where we executed the loop body zero times; we ++ * still need space for the terminating null byte. ++ */ ++ extend (procdata, i, 1u); ++ procdata->original_filename[i] = 0; + + procdata->munged_filename = procdata->original_filename; +@end verbatim + + +VIII. THANKS +============ + +Thanks to Rob Holland and Tavis Ormandy. + + +IX. CVE INFORMATION +=================== + +No CVE candidate number has yet been assigned for this vulnerability. +If someone provides one, I will include it in the public announcement +and change logs. +@end smallexample + +The original announcement above was sent out with a cleartext PGP +signature, of course, but that has been omitted from the example. 
+ +Once a fixed release is available, announce the new release using the +normal channels. Any CVE number assigned for the problem should be +included in the @file{ChangeLog} and @file{NEWS} entries. See +@url{http://cve.mitre.org/} for an explanation of CVE numbers. + + + +@node Making Releases +@chapter Making Releases +This chapter will explain how to make a findutils release. For the +time being, here is a terse description of the main steps: + +@enumerate +@item Commit changes; make sure your working directory has no +uncommitted changes. +@item Test; make sure that all changes you have made have tests, and +that the tests pass. Verify this with @code{make distcheck}. +@item Bugs; make sure all Savannah bug entries fixed in this release +are marked as @samp{Fixed}. +@item NEWS; make sure that the @file{NEWS} and @file{configure.in} files are updated +with the new release number (and checked in). +@item Build the release tarball; do this with @code{make distcheck}. +Copy the tarball somewhere safe. +@item Tag the release; findutils releases are tagged in CVS as +FINDUTILS_x_y_z-1. For example, the tag for findutils release 4.3.8 +is FINDUTILS_4_3_8-1. +@item Prepare the upload and upload it. +@xref{Automated FTP Uploads, ,Automated FTP +Uploads, maintain, Information for Maintainers of GNU Software}, +for detailed upload instructions. +@item Make a release announcement; include an extract from the NEWS +file which explains what's changed. Announcements for test releases +should just go to @email{bug-findutils@@gnu.org}. Announcements for +stable releases should go to @email{info-gnu@@gnu.org} as well. +@item Bump the release numbers in CVS; edit the @file{configure.in} +and @file{NEWS} files to advance the release numbers. For example, +if you have just released @samp{4.6.2}, bump the release number to +@samp{4.6.3-CVS}. 
The point of the @samp{-CVS} suffix here is that a +findutils binary built from CVS will bear a release number indicating +it's not built from the ``official'' source release. +@item Close bugs; any bugs recorded on Savannah which were fixed in this +release should now be marked as closed. Update the @samp{Fixed +Release} field of these bugs appropriately and make sure the +@samp{Assigned to} field is populated. +@end enumerate + + +@bye + +@comment texi related words used by Emacs' spell checker ispell.el + +@comment LocalWords: texinfo setfilename settitle setchapternewpage +@comment LocalWords: iftex finalout ifinfo DIR titlepage vskip pt +@comment LocalWords: filll dir samp dfn noindent xref pxref +@comment LocalWords: var deffn texi deffnx itemx emph asis +@comment LocalWords: findex smallexample subsubsection cindex +@comment LocalWords: dircategory direntry itemize + +@comment other words used by Emacs' spell checker ispell.el +@comment LocalWords: README fred updatedb xargs Plett Rendell akefile +@comment LocalWords: args grep Filesystems fo foo fOo wildcards iname +@comment LocalWords: ipath regex iregex expr fubar regexps +@comment LocalWords: metacharacters macs sr sc inode lname ilname +@comment LocalWords: sysdep noleaf ls inum xdev filesystems usr atime +@comment LocalWords: ctime mtime amin cmin mmin al daystart Sladkey rm +@comment LocalWords: anewer cnewer bckw rf xtype uname gname uid gid +@comment LocalWords: nouser nogroup chown chgrp perm ch maxdepth +@comment LocalWords: mindepth cpio src CD AFS statted stat fstype ufs +@comment LocalWords: nfs tmp mfs printf fprint dils rw djm Nov lwall +@comment LocalWords: POSIXLY fls fprintf strftime locale's EDT GMT AP +@comment LocalWords: EST diff perl backquotes sprintf Falstad Oct cron +@comment LocalWords: eg vmunix mkdir afs allexec allwrite ARG bigram +@comment LocalWords: bigrams cd chmod comp crc CVS dbfile dum eof +@comment LocalWords: fileserver filesystem fn frcode Ghazi Hnewc iXX 
+@comment LocalWords: joeuser Kaveh localpaths localuser LOGNAME +@comment LocalWords: Meyering mv netpaths netuser nonblank nonblanks +@comment LocalWords: ois ok Pinard printindex proc procs prunefs +@comment LocalWords: prunepaths pwd RFS rmadillo rmdir rsh sbins str +@comment LocalWords: su Timar ubins ug unstripped vf VM Weitzel +@comment LocalWords: wildcard zlogout basename execdir wholename iwholename +@comment LocalWords: timestamp timestamps Solaris FreeBSD OpenBSD POSIX -- 2.11.4.GIT