<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<style type="text/css"> /* <![CDATA[ */
@import "tigris-branding/css/tigris.css";
@import "tigris-branding/css/inst.css";
/* ]]> */</style>
<link rel="stylesheet" type="text/css" media="print"
href="tigris-branding/css/print.css"/>
<script type="text/javascript" src="tigris-branding/scripts/tigris.js"></script>
<title>cvs2svn Features</title>
</head>
<body id="bodycol">
<div class="app">
<h1>cvs2svn Features</h1>
<p>The primary goal of cvs2svn is to migrate as much information as
possible from your old CVS repository to your new Subversion or git
repository.</p>
<p>Unfortunately, CVS doesn't record complete information about your
project's history. For example, CVS doesn't record which file
modifications took place together in the same CVS commit. Therefore,
cvs2svn attempts to infer from CVS's incomplete information what
<em>really</em> happened in the history of your repository. So the
second goal of cvs2svn is to reconstruct as much of your CVS
repository's history as possible.</p>
<p>The third goal of cvs2svn is to allow you to customize the
conversion process and the form of your output repository as flexibly
as possible. cvs2svn has many conversion options that can be
used from the command line and many more that can be configured via an
options file, and it provides many hooks to allow even more extreme
customization by writing Python code.</p>
<h2><a name="list">Feature summary</a></h2>
<dl>
<dt>No information lost</dt>
<dd>cvs2svn works hard to avoid losing any information from your CVS
repository (unless you specifically ask for a partial conversion
using <tt>--trunk-only</tt> or <tt>--exclude</tt>).</dd>
<dt>Changesets</dt>
<dd>CVS records modifications file-by-file, and does not keep track
of which files were modified at the same time. cvs2svn uses
information like the file modification times, log messages, and
dependency information to deduce the original changesets. cvs2svn
allows changesets that affect multiple branches and/or multiple
projects (as is allowed by CVS), or it can be configured to split
such changesets up into separate commits
(<tt>--no-cross-branch-commits</tt>; see also options file).</dd>
<dt>Multiproject conversions</dt>
<dd>cvs2svn can convert a CVS repository that contains multiple
projects into a single Subversion repository with the conventional
multiproject directory layout. See <a
href="faq.html#onetoone">the FAQ</a> for more information.</dd>
<dt>Branch vs. tag</dt>
<dd>CVS allows the same symbol name to be used sometimes as a
branch, sometimes as a tag. cvs2svn has options and heuristics to
decide how to convert such "mixed" symbols
(<tt>--symbol-hints</tt>, <tt>--force-branch</tt>,
<tt>--force-tag</tt>, <tt>--symbol-default</tt>).</dd>
<dt>Branch/tag exclusion</dt>
<dd>cvs2svn allows the user to specify branches and/or tags that
should be excluded from the conversion altogether
(<tt>--symbol-hints</tt>, <tt>--exclude</tt>). It checks that the
exclusions are self-consistent (e.g., it doesn't allow a branch to
be excluded if a branch that sprouts from it is not excluded).</dd>
<dt>Branch/tag renaming</dt>
<dd>cvs2svn can rename branches and tags during the conversion using
regular-expression patterns (<tt>--symbol-transform</tt>).</dd>
<dt>Choosing SVN paths for branches/tags</dt>
<dd>You can choose which SVN paths to use as the trunk/branches/tags
directories (<tt>--trunk</tt>, <tt>--branches</tt>,
<tt>--tags</tt>), or set arbitrary paths for specific CVS
branches/tags (<tt>--symbol-hints</tt>). For example, you might
want to store some tags in the <tt>project/tags</tt> directory,
but others in <tt>project/releases</tt>.</dd>
<dt>Branch and tag parents</dt>
<dd>In many cases, the CVS history is ambiguous about which branch
served as the parent of another branch or tag. cvs2svn determines
the most plausible parent for symbols using cross-file
information. You can override cvs2svn's choices on a case-by-case
basis by using the <tt>--symbol-hints</tt> option.</dd>
<dt>Branch and tag creation times</dt>
<dd>CVS does not record when branches and tags are created. cvs2svn
creates branches and tags at a reasonable time, consistent with
the file revisions that were tagged, and tries to create each one
within a single Subversion commit if possible.</dd>
<dt>MIME types</dt>
<dd>CVS does not record files' MIME types. cvs2svn provides several
mechanisms for choosing reasonable file MIME types
(<tt>--mime-types</tt>, <tt>--auto-props</tt>).</dd>
<dt>Binary vs. text</dt>
<dd>Many CVS users do not systematically record which files are
binary and which are text. (This is mostly important if the
repository is used on non-Unix systems.) cvs2svn provides a
number of ways to infer this information
(<tt>--eol-from-mime-type</tt>, <tt>--default-eol</tt>,
<tt>--keywords-off</tt>, <tt>--auto-props</tt>).</dd>
<dt>Subversion file properties</dt>
<dd>Subversion allows arbitrary text properties to be attached to
files. cvs2svn provides a mechanism to set such properties when a
file is first added to the repository
(<tt>--auto-props</tt>) as well as a hook that users can use to
set arbitrary file properties via Python code.</dd>
<dt>Handling of <tt>.cvsignore</tt></dt>
<dd><tt>.cvsignore</tt> files in the CVS repository are converted
into the equivalent <tt>svn:ignore</tt> properties in the output.
By default, the <tt>.cvsignore</tt> files themselves are
<em>not</em> included in the output; this behavior can be changed
by specifying the <tt>--keep-cvsignore</tt> option.</dd>
<dt>Subversion repository customization</dt>
<dd>cvs2svn provides many options that allow you to customize the
structure of the resulting Subversion repository
(<tt>--trunk</tt>, <tt>--branches</tt>, <tt>--tags</tt>,
<tt>--include-empty-directories</tt>, <tt>--no-prune</tt>,
<tt>--symbol-transform</tt>, etc.; see also the additional
customization options available by using the <a
href="#options-file-method"><tt>--options</tt>-file
method</a>).</dd>
<dt>Support for multiple character encodings</dt>
<dd>CVS does not record which character encoding was used to store
metainformation like file names, author names and log messages.
cvs2svn provides options to help convert such text into UTF-8
(<tt>--encoding</tt>, <tt>--fallback-encoding</tt>).</dd>
<dt>Vendor branches</dt>
<dd>CVS supports "vendor branches", which (under some circumstances)
affect the contents of the main line of development. cvs2svn
detects vendor branches whenever possible and handles them
intelligently. For example,
<ul>
<li>cvs2svn explicitly copies vendor branch revisions back to
trunk so that a checkout of trunk gives the same results under
SVN as under CVS.</li>
<li>If a vendor branch is excluded from the conversion, cvs2svn
grafts the relevant vendor branch revisions onto trunk so that
the contents of trunk are still the same as in CVS. If other
tags or branches sprout from these revisions, they are grafted
to trunk as well.</li>
<li>When a file is imported into CVS, CVS creates two revisions
("1.1" and "1.1.1.1") with the same contents. cvs2svn
discards the redundant "1.1" revision in such cases (since
revision "1.1.1.1" will be copied to trunk anyway).</li>
<li>Often users create vendor branches unnecessarily by using
"cvs import" to import their own sources into the CVS
repository. Such vendor branches do not contain any useful
information, so by default cvs2svn excludes any vendor branch
that was only used for a single import. You can change this
default behavior by specifying the
<tt>--keep-trivial-imports</tt> option.</li>
</ul>
</dd>
<dt>CVS quirks</dt>
<dd>cvs2svn goes to great lengths to deal with CVS's many quirks.
For example,
<ul>
<li>CVS introduces spurious "1.1" revisions when a file is added
on a branch. cvs2svn discards these revisions.</li>
<li>If a file is added on a branch, CVS introduces a spurious
"dead" revision at the beginning of the branch to indicate
that the file did not exist when the branch was created.
cvs2svn deletes these spurious revisions and adds the file on
the branch at the correct time.</li>
</ul>
</dd>
<dt>Robust against repository corruption</dt>
<dd>cvs2svn knows how to handle several types of CVS repository
corruption that have been reported frequently, and gives
informative error messages in other cases:
<ul>
<li>An RCS file that exists both in and out of the "Attic"
directory.</li>
<li>Multiple deltatext blocks for a single CVS file
revision.</li>
<li>Multiple revision headers for the same CVS file
revision.</li>
<li>Tags and branches that refer to non-existent revisions or
ill-formed revision numbers.</li>
<li>Repeated definitions of a symbol name to the same revision
number.</li>
<li>Branches that have no associated labels.</li>
<li>A directory name that conflicts with a file name (in or out
of the Attic).</li>
<li>Filenames that contain forbidden characters.</li>
<li>Log messages with variant end-of-line styles.</li>
<li>Vendor branch declarations that refer to non-existent
branches.</li>
</ul>
</dd>
<dt>Timestamp error correction</dt>
<dd>Many CVS repositories contain timestamp errors due to servers'
clocks being set incorrectly during part of the repository's
history. cvs2svn's history reconstruction is relatively robust
against timestamp errors, and it writes monotonic timestamps to the
Subversion repository.</dd>
<dt>Scalable</dt>
<dd>cvs2svn stores most intermediate data in on-disk databases so
that it can convert very large CVS repositories using a reasonable
amount of RAM. Conversions are organized as multiple passes and
can be restarted at an arbitrary pass if problems occur.</dd>
<dt>Configurable/extensible using Python</dt>
<dd>Many aspects of the conversion can be customized using Python
plugins that interact with cvs2svn through documented interfaces
(<tt>--options</tt>).</dd>
</dl>
</div>
</body>
</html>