This file contains a description of the RevisionCollector /
RevisionReader mechanism.

cvs2svn now includes hooks to make it possible to avoid having to
invoke CVS or RCS zillions of times in OutputPass (which is otherwise
the most expensive part of the conversion). Here is a brief
description of how the hooks work.

Each conversion requires an instance of RevisionReader, whose
responsibility is to produce the text contents of CVS revisions on
demand during OutputPass. The RevisionReader can read the CVS
revision contents directly out of the RCS files during OutputPass.
But additional hooks support the construction of different kinds of
RevisionReader that record the CVS file revisions' contents during
FilterSymbolsPass and then output the contents during OutputPass.
(Indeed, for non-SVN backends, OutputPass might not even require the
file contents.)

The interface that is used during FilterSymbolsPass to allow the
collection of revision information is:

    RevisionCollector -- can collect information during
        FilterSymbolsPass to help the RevisionReader produce RCS file
        revision contents during OutputPass.

The type of RevisionCollector/RevisionReader to be used for a run of
cvs2svn can be set using --use-internal-co, --use-rcs, or --use-cvs,
or via the --options file with lines like:

    ctx.revision_collector = MyRevisionCollector()
    ctx.revision_reader = MyRevisionReader()

The following RevisionReaders are supplied with cvs2svn:

    InternalRevisionReader -- an InternalRevisionCollector records the
        revisions' delta text and their dependencies for required
        revisions in FilterSymbolsPass; an InternalRevisionReader
        reconstitutes the revisions' contents during OutputPass from
        the recorded data. This is by far the fastest option, but it
        requires a substantial amount of temporary disk space for the
        duration of the conversion.

    RCSRevisionReader -- uses RCS's "co" command to extract the
        revision text during OutputPass. This is slower than
        InternalRevisionReader because "co" has to be executed very
        many times, but is better tested and does not require any
        temporary disk space. RCSRevisionReader does not use a
        RevisionCollector.

    CVSRevisionReader -- uses the "cvs" command to extract the
        revision text during OutputPass. This is even slower than
        RCSRevisionReader, but it can handle some CVS file quirks that
        stymie RCSRevisionReader (see the cvs2svn HTML documentation).
        CVSRevisionReader does not use a RevisionCollector.
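
As a rough illustration of the pairings listed above, the sketch below
maps each command-line option to a collector/reader pair. The classes
are empty stand-ins that only borrow the names used in this file; they
are not the real cvs2svn classes. READER_CHOICES and
select_revision_manager() are hypothetical names, and None is used
here simply to mean "no RevisionCollector".

```python
# Stand-in classes: the names match those used in this file, but the
# bodies are empty placeholders, not the real cvs2svn implementations.
class InternalRevisionCollector(object):
    pass


class InternalRevisionReader(object):
    pass


class RCSRevisionReader(object):
    pass


class CVSRevisionReader(object):
    pass


# Hypothetical table: option -> (collector class or None, reader class).
# Only the internal reader is paired with a collector.
READER_CHOICES = {
    "--use-internal-co": (InternalRevisionCollector, InternalRevisionReader),
    "--use-rcs": (None, RCSRevisionReader),
    "--use-cvs": (None, CVSRevisionReader),
}


def select_revision_manager(option):
    """Return a (collector, reader) pair of instances for an option."""
    collector_cls, reader_cls = READER_CHOICES[option]
    collector = collector_cls() if collector_cls is not None else None
    return collector, reader_cls()
```

With this table, select_revision_manager("--use-rcs") yields no
collector and an RCSRevisionReader stand-in, mirroring the statement
that RCSRevisionReader does not use a RevisionCollector.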

It is possible to write your own RevisionCollector and RevisionReader
if you would like to do things differently. A RevisionCollector, with
callback methods that are invoked as the CVS files are parsed, can be
used to collect information during FilterSymbolsPass. Its
process_file() method is allowed to set an arbitrary token (for
example, a content hash) in CVSItem.revision_reader_token. This token
is carried along by cvs2svn for use by the RevisionReader in
OutputPass.

Later, when OutputPass requires the file contents, it calls
RevisionReader.get_content_stream(), which is passed a CVSRevision
instance and has to return a stream object that produces the file
revision's contents. A custom RevisionReader could use the token to
retrieve the pre-stored file contents without having to call CVS or
RCS.
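
The collect-then-read flow described above might look roughly like the
following. This is a minimal, self-contained sketch, not cvs2svn code:
FakeCVSRevision, HashingRevisionCollector and StoredRevisionReader are
hypothetical names, the "text" attribute and the in-memory dict store
are assumptions made for illustration, and process_file() is assumed
to receive a plain iterable of revisions. Only process_file(),
get_content_stream() and revision_reader_token are taken from the
description in this file.

```python
import hashlib
import io


class FakeCVSRevision(object):
    """Hypothetical stand-in for a CVSRevision."""

    def __init__(self, text):
        self.text = text                    # illustration only
        self.revision_reader_token = None   # set by the collector


class HashingRevisionCollector(object):
    """Stores each revision's contents under a content-hash token."""

    def __init__(self, store):
        self.store = store

    def process_file(self, cvs_file_items):
        # Assumption: cvs_file_items is a simple iterable of revisions.
        for cvs_rev in cvs_file_items:
            token = hashlib.sha1(cvs_rev.text).hexdigest()
            self.store[token] = cvs_rev.text
            cvs_rev.revision_reader_token = token


class StoredRevisionReader(object):
    """Serves contents from the store without calling CVS or RCS."""

    def __init__(self, store):
        self.store = store

    def get_content_stream(self, cvs_rev):
        # Look up the pre-stored bytes by token and wrap them in a stream.
        return io.BytesIO(self.store[cvs_rev.revision_reader_token])


store = {}
revisions = [FakeCVSRevision(b"hello\n"), FakeCVSRevision(b"hello, world\n")]

# Stands in for FilterSymbolsPass: record contents and set tokens.
HashingRevisionCollector(store).process_file(revisions)

# Stands in for OutputPass: fetch contents back through the reader.
reader = StoredRevisionReader(store)
contents = [reader.get_content_stream(rev).read() for rev in revisions]
```

A real implementation would presumably keep the recorded data on disk
rather than in a dict, since the collector and reader run in different
passes of the conversion.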