description | Instructions for creating a git mirror of the SQLite sources |
owner | mackyle@gmail.com |
last change | Sun, 24 Jul 2022 18:17:22 +0000 (24 11:17 -0700) |
URL | git://repo.or.cz/sqlite-export.git, https://repo.or.cz/sqlite-export.git |
push URL | ssh://repo.or.cz/sqlite-export.git, https://repo.or.cz/sqlite-export.git |
As of the SQLite 3.28.0 release there now exists an official Git mirror of the SQLite software at <https://github.com/sqlite/sqlite>.
As of the Fossil 2.9 release on 2019-07-13, a new `fossil git export`
command provides the ability to export a Fossil repository to git.
This project continues to provide an alternative mechanism to export
a fossil project to Git. The `fossil git export` command in fossil 2.9
and later produces a git repository that more-or-less matches one
created by using this project's `--trailer` and `--manifest` options,
but without providing any mapping of user IDs to real user names.
Method | User Names | Options | Repository |
---|---|---|---|
fossil 2.9 | left as ID | N/A | FossilOrigin-Name trailers and manifests |
sqlite-export | mapped | <none> | No notes, trailer lines or manifest files |
sqlite-export | mapped | --notes | refs/notes/fossil records fossil check-in |
sqlite-export | mapped | --trailer | FossilOrigin-Name trailer line added |
sqlite-export | mapped | --manifest | manifest and manifest.uuid files added |
In fact, any of the three options (`--notes`, `--trailer`, `--manifest`)
may be used with this sqlite-export project in any combination to
produce the desired output repository with whatever "extras" are wanted.
Repository | Producer | Extras |
---|---|---|
$GH/sqlite/sqlite | fossil 2.9+ | no user mapping, always trailers and manifests |
$repo/sqlite | sqlite-export | users mapped, fossil origin in refs/notes/fossil |
$repo/sqlite-manifest | sqlite-export | users mapped, trailers and manifests (no notes) |
$GH/mackyle/sqlite | sqlite-export | mirror of $repo/sqlite |
The $repo/sqlite repository and its $GH/mackyle/sqlite mirror are
maintained using this project and the `--notes` option.
The $repo/sqlite-manifest repository is maintained using this
project and both the `--trailer` and `--manifest` options but not
the `--notes` option and should be substantially similar to the
official git mirror of SQLite ($GH/sqlite/sqlite) except that
user ids have been mapped to user names.
Reminder
Fossil version 2.9 and later directly supports exporting the SQLite fossil repository to git. There's no reason to use this project if that export process meets your needs. See the fossil 2.9 release notes for details.
Theoretically, exporting with fossil prior to version 2.9 is as simple as:

    fossil export --git | git fast-import

Unfortunately it doesn't work that way, and that's what this project is all about.
Note
Run the `build` script with the `-h` option to see some examples of possible arguments. Any arguments passed to the `build` script are passed along to the fossil `configure` script during the build process. Most systems will not require any arguments be passed to the `build` script.
Run the script `build` to fetch and build a suitable fossil tool and a
`git-export-filter` tool.
Run the script `import` (maybe with the `--notes` and/or other options)
to create an `sqlite.git` Git clone of the <https://sqlite.org/src> fossil
sources. (May take up to 60 minutes.)
See the "Building" section at the bottom of this README to "make" SQLite.
There are two problems with fossil export:
1. fossil versions starting with 1.18 mangle export branch and tag names to
   avoid including characters git does not allow. The problem is that many
   more characters are mangled than needed so that a tag like `version-1.18`
   is converted to `version_1_18` unnecessarily.
2. fossil versions after 1.18 produce a Git fast-import data stream that
   causes `git fast-import` to fail with a fatal error.
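To make the first problem concrete, here is a rough sketch of over-eager mangling. It is not fossil's actual code: the `mangle_all` function and its "replace everything that is not a letter or digit" rule are assumptions chosen only to reproduce the `version-1.18` example above.

```shell
#!/bin/sh
# Hypothetical stand-in for the over-eager mangling: every character that is
# not a letter or digit becomes "_". Fossil's real rule may differ in detail.
mangle_all() {
    printf '%s\n' "$1" | sed 's/[^A-Za-z0-9]/_/g'
}

mangle_all "version-1.18"

# Git itself has no objection to the original name (skipped if git is absent).
if command -v git >/dev/null 2>&1; then
    git check-ref-format "refs/tags/version-1.18" && echo "git accepts version-1.18"
fi
```

The `git check-ref-format` call is the quickest way to confirm whether a given name really needs any mangling at all.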
The fossil change that introduces tag mangling is here:
It was a well-intentioned change as previously invalid Git names would be exported, but it went way, way too far. In fact, the actual Git rules about allowable characters in names are:
A patch is included in the file `patches/export_c_patch_diff.txt` that allows
the full diversity of git names to be used and should be applied to the fossil
`src/export.c` file of fossil version 2.1 before building fossil. It also adds
an optional `--notes` option to the `fossil export --git` command that if given
will add a note in the `refs/notes/fossil` namespace to each commit giving the
original fossil check-in hash for the commit. Furthermore, it also provides a
new `--use-done-feature` option (see `git help fast-import`) and makes sure there
aren't any whitespace issues with commit messages by transforming CRLF into
just LF and making sure the only whitespace at the end is a single LF.
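The commit message cleanup described above can be approximated in plain shell. This is only a sketch of the stated behaviour (CRLF to LF, a single LF at the end); the patch does this in C inside fossil, and the `normalize_message` name is made up for the example.

```shell
#!/bin/sh
# normalize_message: read a commit message on stdin and write it back with
# CRLF line endings turned into LF and exactly one LF at the very end.
normalize_message() {
    cr=$(printf '\r')
    # Drop a CR at the end of each line; $(...) then strips trailing newlines.
    body=$(sed "s/${cr}\$//")
    # Trim any other whitespace still left at the end of the message.
    while :; do
        case $body in
            *[[:space:]]) body=${body%?} ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$body"
}

printf 'Fix the build\r\n\r\nDetails here.  \r\n\r\n' | normalize_message
```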
There may be updates coming to the official fossil release to address this name mangling problem, but as of fossil 2.1 they have yet to make it into any official fossil release.
The fossil change that introduces the export problem is here:
There is even a ticket about this "timewarp" export issue here:
This issue affects the `sqlite`, `sqlite_docsrc` and `fossil` repositories,
making it impossible to export them from fossil and import them into Git with
a current version of fossil.
The fossil ticket linked to in the above "The Export Problem" section talks about "timewarps". These are simply check-ins with a timestamp that is earlier than at least one of their parents (merges have two parents, most others one).
Fossil doesn't much like these. The Git fast-import format is a "streamy" format that, while it allows back references to things earlier in the stream, does not allow forward references to future, prospective data. Fossil likes to output its fast-import stream in check-in date order. And there you see the issue. If a "timewarp" is present then children get put out before their parents arrive, and Git rudely ends the fast-import operation when this occurs.
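A "timewarp" as defined above is easy to spot mechanically. The sketch below assumes a made-up input layout of one check-in per line (`ID UNIX_TIMESTAMP [PARENT_ID ...]`); neither the layout nor the `find_timewarps` name comes from fossil.

```shell
#!/bin/sh
# Input: one check-in per line as "ID UNIX_TIMESTAMP [PARENT_ID ...]".
# (This layout is invented for the example; it is not a fossil format.)
find_timewarps() {
    awk '
        { ts[$1] = $2; line[NR] = $0 }
        END {
            for (i = 1; i <= NR; i++) {
                n = split(line[i], f, " ")
                for (p = 3; p <= n; p++)
                    if ((f[p] in ts) && f[2] + 0 < ts[f[p]] + 0)
                        print f[1] " is earlier than its parent " f[p]
            }
        }
    '
}

# Check-in c claims a timestamp (150) before its parent b (200).
find_timewarps <<'EOF'
a 100
b 200 a
c 150 b
d 300 b c
EOF
```

Sorting that sample by timestamp would emit `c` before `b`, which is exactly the forward reference that fast-import rejects.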
All three of the primary fossil repositories (SQLite, SQLite Docs, Fossil) have at least one "timewarp" in them.
Fossil versions 1.18 and earlier produce a usable fast-import stream not because they order the output check-ins correctly in spite of the "timewarp", but because they output all data for each check-in rather than outputting only differences from the parent(s). So while the output isn't really correct, it is accepted by Git and when outside the "timewarp" portion of the history, the converted Git commits have exactly the correct set of sources, so it's really not much more than a minor annoyance when reviewing very tiny parts of older history in the repository.
Starting with fossil version 1.19 this all changed. Now, whenever possible, the exported Git fast-import stream only includes "changes" from a check-in's parent(s). With a sloppy ordering based only on check-in timestamp and in the presence of "timewarps", children get put out before their parent(s) arrive with the ensuing Git rudeness. While, on the surface, this seems like a good change (and it brought the ability to do incremental exports), full exports seem to take somewhat longer overall now.
Then on 2017-02-23, they "shattered it" <https://shattered.it/>.
Shortly thereafter fossil version 2.0 came out supporting additional hash functions. And on 2017-03-12 the official SQLite fossil repository got its first check-in using the new hash function. Versions of fossil prior to 2.0 cannot deal with these new hash function values.
Now you see the problem. Fossil version 1.18 can no longer be used (even with its technically incorrect output) as it cannot understand the new hash values. But fossil versions 1.19 and later (including 2.0) cannot be used either since they produce a completely unacceptable fast-import stream in the presence of any "timewarps".
But, curiosity is a harsh mistress. The topological ordering problem was
solved even for fossil 1.18 in a satisfactory way some time ago but never
published to avoid causing all the Git refs values to be force-updated.
Correcting the misordering caused by the "timewarps" alters the DAG (directed
acyclic graph) of check-in ancestry and that trickles down to all the children
causing all of their commit hash values to change even though the sources they
refer to remain completely unchanged.
As of 2017-03-12 there really isn't a choice anymore.
A GPL version 2 (or later) patch is included to address this in the file
`patches/export_topo_patch_diff.txt` that provides a guaranteed topological
ordering to the exported fast-import stream. When it's built into fossil,
that version of fossil also becomes covered by the GPL. The repository data
fossil maintains is unaffected by fossil's license(s), so having a GPL-covered
fossil binary should not really affect anyone.
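As a rough illustration of what a topological ordering guarantees (this is not how the patch is implemented), the standard `tsort` utility can order a small, made-up ancestry list so that every parent comes out before its children, whatever the timestamps say. The check-in names and the `topo_order` wrapper are hypothetical.

```shell
#!/bin/sh
# Each input line is "PARENT CHILD"; tsort prints every parent before any of
# its children, which is the property a fast-import stream needs.
topo_order() {
    tsort
}

# Suppose c carries an earlier timestamp than its parent b (a "timewarp").
# Ordering by ancestry instead of by date still puts b first.
printf '%s\n' "a b" "b c" "c d" | topo_order
```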
A .tar.gz archive of the fossil 2.1 sources may be fetched from:
<https://fossil-scm.org/index.html/uv/fossil-src-2.1.tar.gz>
The downloaded .tar.gz file should have these size and hash values:
size: 4802504 bytes
md5: 9f32b23cecb092d42cdf11bf003ebf8d
sha1: 7c7387efb4c0016de6e836dba6f7842246825678
sha256: 85dcdf10d0f1be41eef53839c6faaa73d2498a9a140a89327cfb092f23cfef05
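A downloaded copy can be checked against the size and SHA-256 value listed above with a few lines of shell. This is a minimal sketch: the `verify_archive` helper is made up for the example, and it assumes the GNU `sha256sum` tool is installed (on other systems a different checksum command may be needed).

```shell
#!/bin/sh
# verify_archive FILE EXPECTED_SIZE EXPECTED_SHA256
# Returns 0 only when both the byte count and the SHA-256 digest match.
verify_archive() {
    file=$1 want_size=$2 want_sha=$3
    [ -f "$file" ] || return 1
    have_size=$(($(wc -c < "$file")))
    have_sha=$(sha256sum "$file" | cut -d' ' -f1)
    [ "$have_size" -eq "$want_size" ] && [ "$have_sha" = "$want_sha" ]
}

if verify_archive fossil-src-2.1.tar.gz 4802504 \
    85dcdf10d0f1be41eef53839c6faaa73d2498a9a140a89327cfb092f23cfef05
then
    echo "archive OK"
else
    echo "archive missing or does not match" >&2
fi
```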
The `archives` subdirectory contains a copy of this .tar.gz file and it will
be used by the `build` script to create a `fossil` executable that reports its
version as `2.1+export` to confirm that it contains the export fixes.
The Git fast-import facility does not provide a means to filter the incoming data stream to adjust user names (fossil export data only includes the user login name as the email address) nor a means to adjust branch/tag names (fossil exports a 'trunk' branch where Git expects a 'master' branch and fossil also exports what are essentially lightweight tags as annotated tags).
To deal with these issues, the `git-export-filter` utility is used.
It can be found at:
The included `sqlite_authors` file is used with the `git-export-filter` tool to
supply real user names and email addresses. The same `sqlite_authors` file
also works for the <https://sqlite.org/docsrc> fossil repository.
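To show the kind of rewriting involved, here is a deliberately naive, line-oriented sketch. It is not `git-export-filter` and should not be used in its place: it does not parse `data` sections, so it would also rewrite matching lines inside file contents. The authors-file layout (`login = Real Name <email>`), the shape of the sample `committer` line, and the `map_stream` name are all assumptions made for the example.

```shell
#!/bin/sh
# map_stream AUTHORS_FILE  (reads a fast-import stream on stdin)
# AUTHORS_FILE lines look like "login = Real Name <email>"; that layout is an
# assumption for this sketch, not necessarily what git-export-filter expects.
map_stream() {
    awk -v authors="$1" '
        BEGIN {
            while ((getline entry < authors) > 0) {
                eq = index(entry, " = ")
                if (eq) name[substr(entry, 1, eq - 1)] = substr(entry, eq + 3)
            }
        }
        # Rename the branch that fossil exports as "trunk".
        $0 == "commit refs/heads/trunk" { print "commit refs/heads/master"; next }
        # Replace "committer login <login> WHEN" with the mapped identity.
        $1 == "committer" && ($2 in name) {
            when = $0
            sub(/^[^>]*> /, "", when)
            print "committer " name[$2] " " when
            next
        }
        { print }
    '
}

authors=$(mktemp)
echo 'jdoe = Jane Doe <jane@example.com>' > "$authors"
printf '%s\n' 'commit refs/heads/trunk' 'committer jdoe <jdoe> 1489276800 +0000' |
    map_stream "$authors"
rm -f "$authors"
```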
After building a patched version of fossil 2.1 as described above and the
`git-export-filter` utility, a Git repository of the SQLite sources can be
created like so (which is what the `import --notes` script does):
    fossil clone https://sqlite.org/src sqlite.fsl
    git --git-dir=sqlite.git init --bare
    fossil export --git --notes sqlite.fsl |
    git-export-filter --authors-file sqlite_authors --require-authors \
      --trunk-is-master --convert-tagger tagger |
    git --git-dir=sqlite.git fast-import
The above will create the `sqlite.git` Git repository that is a clone of the
SQLite sources from the SQLite fossil repository <https://sqlite.org/src>
(note that only sources are cloned, not tickets or wiki pages or events).
The provided `build` script will attempt to download the necessary sources,
patch them and build suitable `fossil` and `git-export-filter` executable files.
It will pass along any arguments directly to the fossil `configure` script.
Run the `build` script with the `-h` option for examples (most systems will not
require any arguments be passed to the `build` script).
The provided `import` script will then attempt to clone the SQLite sources
and convert them into an `sqlite.git` repository. It may be run again to update
the `sqlite.git` repository with new changes. It accepts the `--notes` option
(which is recommended) to enable generation of the `refs/notes/fossil` notes
containing the original fossil check-in hash. It also accepts the `--trailer`
and `--manifest` options which may be used in any combination with or without
the `--notes` option.
The initial run of the `import` script may take up to 60 minutes on a fast
machine, and subsequent runs of `import` even on a fast machine will still,
unfortunately, take some time. The CPU will be pounded in either case.
IMPORTANT
Options passed to the `import` script are not remembered, so make sure to
pass the same options (e.g. `--notes`) to the `import` script every time it's
run if it's being used to update a previously exported Git repository, or you
may end up with out-of-date notes and/or mismatched trailer/manifest commits.
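One way to guard against that mistake is a small wrapper that records the options from the first run and replays them afterwards. This is only a sketch and not part of the project: the `run_import` function, the `.import-options` file name, and the `IMPORT_CMD` variable are all hypothetical.

```shell
#!/bin/sh
# run_import [OPTIONS...]
# First run: records the given options in $OPTS_FILE, then runs the import.
# Later runs: replays the recorded options and refuses different ones.
OPTS_FILE=${OPTS_FILE:-.import-options}   # hypothetical file name
IMPORT_CMD=${IMPORT_CMD:-./import}        # path to the import script

run_import() {
    if [ -f "$OPTS_FILE" ]; then
        saved=$(cat "$OPTS_FILE")
        if [ $# -gt 0 ] && [ "$*" != "$saved" ]; then
            echo "refusing: previous runs used: $saved" >&2
            return 1
        fi
        # Intentionally unquoted: the options never contain whitespace.
        set -- $saved
    else
        printf '%s\n' "$*" > "$OPTS_FILE"
    fi
    "$IMPORT_CMD" "$@"
}

# Example (not executed here):
#   run_import --notes     # first run, remembers --notes
#   run_import             # later runs reuse --notes
```

The comparison is a plain string match, so the same options given in a different order are also refused; that keeps the sketch short at the cost of some strictness.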
There are new options provided by the patch files for the `fossil export`
command. As a convenience, they may be given to the `import` script which
will just pass them on to the `fossil export` command.
`--notes`
Included with the export tags patch, a new fossil export option `--notes` is provided that adds a Git commit note to the `refs/notes/fossil` namespace which contains the original fossil check-in hash for each fossil check-in exported to Git. Use `git log --notes=fossil` to see these notes.
`--trailer`
Included with the export tags patch, a new fossil export option `--trailer` is provided that adds a "FossilOrigin-Name:" trailer line to each commit created in the git repository that includes the original fossil check-in hash for that commit.
`--manifest`
Included with the export tags patch, a new fossil export option `--manifest` is provided that causes every commit created in the git repository to include a `manifest` and `manifest.uuid` file. Use of this option will increase the size of the generated git repository by approximately 25%.
`--use-done-feature`
Included with the export tags patch, a new fossil export option `--use-done-feature` is provided that includes the `feature done` and `done` commands at the beginning and end respectively of the exported fast-import stream. This can help avoid partial imports. See the `git help fast-import` description of the `--done` option and the `git help fast-export` description of the `--use-done-feature` option.
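For repositories exported with `--trailer`, the original check-in hash can be pulled back out of a commit message with a one-line filter. The `fossil_origin` helper below is a made-up name and the sample message is invented; only the "FossilOrigin-Name:" prefix comes from the option description above.

```shell
#!/bin/sh
# fossil_origin: read a commit message on stdin and print the fossil check-in
# hash from the last "FossilOrigin-Name:" trailer line, if there is one.
fossil_origin() {
    sed -n 's/^FossilOrigin-Name:[[:space:]]*//p' | tail -n 1
}

# Typical use inside a repository exported with --trailer:
#   git log -1 --format=%B | fossil_origin
printf '%s\n' 'Fix a typo.' '' 'FossilOrigin-Name: 0123abcd' | fossil_origin
```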
QUICKLY
1. Clone/checkout the new `sqlite.git` repository into a new working tree
2. Run the `create-fossil-manifest` script from this repository with the
   current working directory set to the new working tree created in (1)
3. Now run the `configure` script in the new working tree created in (1)
4. Now run `make` in the new working tree created in (1)
DETAILS
Ideally, simply cloning from the new `sqlite.git` repository would allow one to
then build SQLite by simply using `make` (or `configure` and `make`).
Unfortunately, this is not the case: make will fail with a message about
no rule to make the files `manifest` and/or `manifest.uuid` unless the
`--manifest` option was passed to the `import` script.
Both the SQLite sources and the Fossil sources require two fossil vcs specific
files to be created (`manifest` and `manifest.uuid`) in order for make to be
successful. When the `--manifest` option is passed to the `import` script
these files are added to every commit in the generated git repository which
increases the repository size by roughly 25%.
The `manifest.uuid` file simply contains the hash of the current checkout
and while a real `manifest` file contains a bunch of information, the only
thing that need be present is a line containing the UTC ISO date preceded
by 'D '.
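The shape of those two files can be sketched as follows. This is not the project's `create-fossil-manifest` script, which should be used for real builds; the `write_manifest_files` helper, the sample hash, and the exact date layout shown are assumptions made purely to illustrate the description above.

```shell
#!/bin/sh
# write_manifest_files HASH UTC_ISO_DATE [DIR]
# Writes DIR/manifest.uuid (the check-in hash) and a minimal DIR/manifest
# holding only the "D " date line.
write_manifest_files() {
    hash=$1 date=$2 dir=${3:-.}
    printf '%s\n' "$hash" > "$dir/manifest.uuid"
    printf 'D %s\n' "$date" > "$dir/manifest"
}

# Demonstration in a scratch directory with placeholder values.
demo=$(mktemp -d)
write_manifest_files 0123abcd 2017-03-12T12:00:00 "$demo"
cat "$demo/manifest.uuid" "$demo/manifest"
rm -rf "$demo"
```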
The `create-fossil-manifest` script takes care of creating these files and
should be run with the current working directory set to the top-level of the
git clone's working directory if the `--manifest` option was NOT passed to
the `import` script.
Any time the HEAD commit changes, the `create-fossil-manifest` script should
be run to update the `manifest` and `manifest.uuid` files (only if the
`--manifest` option was NOT passed to the `import` script) before next
running make or the output of the `sqlite_source_id()` function will be
incorrect.