Autogenerated HTML docs for v2.42.0-526-g3130c1
[git-htmldocs.git] / technical / bitmap-format.html
blobfb76b391cf9d8eb114d24bd08bf4137428d8a089
1 <?xml version="1.0" encoding="UTF-8"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
3 "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
4 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
5 <head>
6 <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=UTF-8" />
7 <meta name="generator" content="AsciiDoc 10.2.0" />
8 <title>GIT bitmap v1 format</title>
9 <style type="text/css">
10 /* Shared CSS for AsciiDoc xhtml11 and html5 backends */
12 /* Default font. */
13 body {
14 font-family: Georgia,serif;
17 /* Title font. */
18 h1, h2, h3, h4, h5, h6,
19 div.title, caption.title,
20 thead, p.table.header,
21 #toctitle,
22 #author, #revnumber, #revdate, #revremark,
23 #footer {
24 font-family: Arial,Helvetica,sans-serif;
27 body {
28 margin: 1em 5% 1em 5%;
31 a {
32 color: blue;
33 text-decoration: underline;
35 a:visited {
36 color: fuchsia;
39 em {
40 font-style: italic;
41 color: navy;
44 strong {
45 font-weight: bold;
46 color: #083194;
49 h1, h2, h3, h4, h5, h6 {
50 color: #527bbd;
51 margin-top: 1.2em;
52 margin-bottom: 0.5em;
53 line-height: 1.3;
56 h1, h2, h3 {
57 border-bottom: 2px solid silver;
59 h2 {
60 padding-top: 0.5em;
62 h3 {
63 float: left;
65 h3 + * {
66 clear: left;
68 h5 {
69 font-size: 1.0em;
72 div.sectionbody {
73 margin-left: 0;
76 hr {
77 border: 1px solid silver;
80 p {
81 margin-top: 0.5em;
82 margin-bottom: 0.5em;
85 ul, ol, li > p {
86 margin-top: 0;
88 ul > li { color: #aaa; }
89 ul > li > * { color: black; }
91 .monospaced, code, pre {
92 font-family: "Courier New", Courier, monospace;
93 font-size: inherit;
94 color: navy;
95 padding: 0;
96 margin: 0;
98 pre {
99 white-space: pre-wrap;
102 #author {
103 color: #527bbd;
104 font-weight: bold;
105 font-size: 1.1em;
107 #email {
109 #revnumber, #revdate, #revremark {
112 #footer {
113 font-size: small;
114 border-top: 2px solid silver;
115 padding-top: 0.5em;
116 margin-top: 4.0em;
118 #footer-text {
119 float: left;
120 padding-bottom: 0.5em;
122 #footer-badges {
123 float: right;
124 padding-bottom: 0.5em;
127 #preamble {
128 margin-top: 1.5em;
129 margin-bottom: 1.5em;
131 div.imageblock, div.exampleblock, div.verseblock,
132 div.quoteblock, div.literalblock, div.listingblock, div.sidebarblock,
133 div.admonitionblock {
134 margin-top: 1.0em;
135 margin-bottom: 1.5em;
137 div.admonitionblock {
138 margin-top: 2.0em;
139 margin-bottom: 2.0em;
140 margin-right: 10%;
141 color: #606060;
144 div.content { /* Block element content. */
145 padding: 0;
148 /* Block element titles. */
149 div.title, caption.title {
150 color: #527bbd;
151 font-weight: bold;
152 text-align: left;
153 margin-top: 1.0em;
154 margin-bottom: 0.5em;
156 div.title + * {
157 margin-top: 0;
160 td div.title:first-child {
161 margin-top: 0.0em;
163 div.content div.title:first-child {
164 margin-top: 0.0em;
166 div.content + div.title {
167 margin-top: 0.0em;
170 div.sidebarblock > div.content {
171 background: #ffffee;
172 border: 1px solid #dddddd;
173 border-left: 4px solid #f0f0f0;
174 padding: 0.5em;
177 div.listingblock > div.content {
178 border: 1px solid #dddddd;
179 border-left: 5px solid #f0f0f0;
180 background: #f8f8f8;
181 padding: 0.5em;
184 div.quoteblock, div.verseblock {
185 padding-left: 1.0em;
186 margin-left: 1.0em;
187 margin-right: 10%;
188 border-left: 5px solid #f0f0f0;
189 color: #888;
192 div.quoteblock > div.attribution {
193 padding-top: 0.5em;
194 text-align: right;
197 div.verseblock > pre.content {
198 font-family: inherit;
199 font-size: inherit;
201 div.verseblock > div.attribution {
202 padding-top: 0.75em;
203 text-align: left;
205 /* DEPRECATED: Pre version 8.2.7 verse style literal block. */
206 div.verseblock + div.attribution {
207 text-align: left;
210 div.admonitionblock .icon {
211 vertical-align: top;
212 font-size: 1.1em;
213 font-weight: bold;
214 text-decoration: underline;
215 color: #527bbd;
216 padding-right: 0.5em;
218 div.admonitionblock td.content {
219 padding-left: 0.5em;
220 border-left: 3px solid #dddddd;
223 div.exampleblock > div.content {
224 border-left: 3px solid #dddddd;
225 padding-left: 0.5em;
228 div.imageblock div.content { padding-left: 0; }
229 span.image img { border-style: none; vertical-align: text-bottom; }
230 a.image:visited { color: white; }
232 dl {
233 margin-top: 0.8em;
234 margin-bottom: 0.8em;
236 dt {
237 margin-top: 0.5em;
238 margin-bottom: 0;
239 font-style: normal;
240 color: navy;
242 dd > *:first-child {
243 margin-top: 0.1em;
246 ul, ol {
247 list-style-position: outside;
249 ol.arabic {
250 list-style-type: decimal;
252 ol.loweralpha {
253 list-style-type: lower-alpha;
255 ol.upperalpha {
256 list-style-type: upper-alpha;
258 ol.lowerroman {
259 list-style-type: lower-roman;
261 ol.upperroman {
262 list-style-type: upper-roman;
265 div.compact ul, div.compact ol,
266 div.compact p, div.compact p,
267 div.compact div, div.compact div {
268 margin-top: 0.1em;
269 margin-bottom: 0.1em;
272 tfoot {
273 font-weight: bold;
275 td > div.verse {
276 white-space: pre;
279 div.hdlist {
280 margin-top: 0.8em;
281 margin-bottom: 0.8em;
283 div.hdlist tr {
284 padding-bottom: 15px;
286 dt.hdlist1.strong, td.hdlist1.strong {
287 font-weight: bold;
289 td.hdlist1 {
290 vertical-align: top;
291 font-style: normal;
292 padding-right: 0.8em;
293 color: navy;
295 td.hdlist2 {
296 vertical-align: top;
298 div.hdlist.compact tr {
299 margin: 0;
300 padding-bottom: 0;
303 .comment {
304 background: yellow;
307 .footnote, .footnoteref {
308 font-size: 0.8em;
311 span.footnote, span.footnoteref {
312 vertical-align: super;
315 #footnotes {
316 margin: 20px 0 20px 0;
317 padding: 7px 0 0 0;
320 #footnotes div.footnote {
321 margin: 0 0 5px 0;
324 #footnotes hr {
325 border: none;
326 border-top: 1px solid silver;
327 height: 1px;
328 text-align: left;
329 margin-left: 0;
330 width: 20%;
331 min-width: 100px;
334 div.colist td {
335 padding-right: 0.5em;
336 padding-bottom: 0.3em;
337 vertical-align: top;
339 div.colist td img {
340 margin-top: 0.3em;
343 @media print {
344 #footer-badges { display: none; }
347 #toc {
348 margin-bottom: 2.5em;
351 #toctitle {
352 color: #527bbd;
353 font-size: 1.1em;
354 font-weight: bold;
355 margin-top: 1.0em;
356 margin-bottom: 0.1em;
359 div.toclevel0, div.toclevel1, div.toclevel2, div.toclevel3, div.toclevel4 {
360 margin-top: 0;
361 margin-bottom: 0;
363 div.toclevel2 {
364 margin-left: 2em;
365 font-size: 0.9em;
367 div.toclevel3 {
368 margin-left: 4em;
369 font-size: 0.9em;
371 div.toclevel4 {
372 margin-left: 6em;
373 font-size: 0.9em;
376 span.aqua { color: aqua; }
377 span.black { color: black; }
378 span.blue { color: blue; }
379 span.fuchsia { color: fuchsia; }
380 span.gray { color: gray; }
381 span.green { color: green; }
382 span.lime { color: lime; }
383 span.maroon { color: maroon; }
384 span.navy { color: navy; }
385 span.olive { color: olive; }
386 span.purple { color: purple; }
387 span.red { color: red; }
388 span.silver { color: silver; }
389 span.teal { color: teal; }
390 span.white { color: white; }
391 span.yellow { color: yellow; }
393 span.aqua-background { background: aqua; }
394 span.black-background { background: black; }
395 span.blue-background { background: blue; }
396 span.fuchsia-background { background: fuchsia; }
397 span.gray-background { background: gray; }
398 span.green-background { background: green; }
399 span.lime-background { background: lime; }
400 span.maroon-background { background: maroon; }
401 span.navy-background { background: navy; }
402 span.olive-background { background: olive; }
403 span.purple-background { background: purple; }
404 span.red-background { background: red; }
405 span.silver-background { background: silver; }
406 span.teal-background { background: teal; }
407 span.white-background { background: white; }
408 span.yellow-background { background: yellow; }
410 span.big { font-size: 2em; }
411 span.small { font-size: 0.6em; }
413 span.underline { text-decoration: underline; }
414 span.overline { text-decoration: overline; }
415 span.line-through { text-decoration: line-through; }
417 div.unbreakable { page-break-inside: avoid; }
421 * xhtml11 specific
423 * */
425 div.tableblock {
426 margin-top: 1.0em;
427 margin-bottom: 1.5em;
429 div.tableblock > table {
430 border: 3px solid #527bbd;
432 thead, p.table.header {
433 font-weight: bold;
434 color: #527bbd;
436 p.table {
437 margin-top: 0;
439 /* Because the table frame attribute is overridden by CSS in most browsers. */
440 div.tableblock > table[frame="void"] {
441 border-style: none;
443 div.tableblock > table[frame="hsides"] {
444 border-left-style: none;
445 border-right-style: none;
447 div.tableblock > table[frame="vsides"] {
448 border-top-style: none;
449 border-bottom-style: none;
454 * html5 specific
456 * */
458 table.tableblock {
459 margin-top: 1.0em;
460 margin-bottom: 1.5em;
462 thead, p.tableblock.header {
463 font-weight: bold;
464 color: #527bbd;
466 p.tableblock {
467 margin-top: 0;
469 table.tableblock {
470 border-width: 3px;
471 border-spacing: 0px;
472 border-style: solid;
473 border-color: #527bbd;
474 border-collapse: collapse;
476 th.tableblock, td.tableblock {
477 border-width: 1px;
478 padding: 4px;
479 border-style: solid;
480 border-color: #527bbd;
483 table.tableblock.frame-topbot {
484 border-left-style: hidden;
485 border-right-style: hidden;
487 table.tableblock.frame-sides {
488 border-top-style: hidden;
489 border-bottom-style: hidden;
491 table.tableblock.frame-none {
492 border-style: hidden;
495 th.tableblock.halign-left, td.tableblock.halign-left {
496 text-align: left;
498 th.tableblock.halign-center, td.tableblock.halign-center {
499 text-align: center;
501 th.tableblock.halign-right, td.tableblock.halign-right {
502 text-align: right;
505 th.tableblock.valign-top, td.tableblock.valign-top {
506 vertical-align: top;
508 th.tableblock.valign-middle, td.tableblock.valign-middle {
509 vertical-align: middle;
511 th.tableblock.valign-bottom, td.tableblock.valign-bottom {
512 vertical-align: bottom;
517 * manpage specific
519 * */
521 body.manpage h1 {
522 padding-top: 0.5em;
523 padding-bottom: 0.5em;
524 border-top: 2px solid silver;
525 border-bottom: 2px solid silver;
527 body.manpage h2 {
528 border-style: none;
530 body.manpage div.sectionbody {
531 margin-left: 3em;
534 @media print {
535 body.manpage div#toc { display: none; }
539 </style>
540 <script type="text/javascript">
541 /*<![CDATA[*/
542 var asciidoc = { // Namespace.
544 /////////////////////////////////////////////////////////////////////
545 // Table Of Contents generator
546 /////////////////////////////////////////////////////////////////////
548 /* Author: Mihai Bazon, September 2002
549 * http://students.infoiasi.ro/~mishoo
551 * Table Of Content generator
552 * Version: 0.4
554 * Feel free to use this script under the terms of the GNU General Public
555 * License, as long as you do not remove or alter this notice.
558 /* modified by Troy D. Hanson, September 2006. License: GPL */
559 /* modified by Stuart Rackham, 2006, 2009. License: GPL */
561 // toclevels = 1..4.
562 toc: function (toclevels) {
564 function getText(el) {
565 var text = "";
566 for (var i = el.firstChild; i != null; i = i.nextSibling) {
567 if (i.nodeType == 3 /* Node.TEXT_NODE */) // IE doesn't speak constants.
568 text += i.data;
569 else if (i.firstChild != null)
570 text += getText(i);
572 return text;
575 function TocEntry(el, text, toclevel) {
576 this.element = el;
577 this.text = text;
578 this.toclevel = toclevel;
581 function tocEntries(el, toclevels) {
582 var result = new Array;
583 var re = new RegExp('[hH]([1-'+(toclevels+1)+'])');
584 // Function that scans the DOM tree for header elements (the DOM2
585 // nodeIterator API would be a better technique but not supported by all
586 // browsers).
587 var iterate = function (el) {
588 for (var i = el.firstChild; i != null; i = i.nextSibling) {
589 if (i.nodeType == 1 /* Node.ELEMENT_NODE */) {
590 var mo = re.exec(i.tagName);
591 if (mo && (i.getAttribute("class") || i.getAttribute("className")) != "float") {
592 result[result.length] = new TocEntry(i, getText(i), mo[1]-1);
594 iterate(i);
598 iterate(el);
599 return result;
602 var toc = document.getElementById("toc");
603 if (!toc) {
604 return;
607 // Delete existing TOC entries in case we're reloading the TOC.
608 var tocEntriesToRemove = [];
609 var i;
610 for (i = 0; i < toc.childNodes.length; i++) {
611 var entry = toc.childNodes[i];
612 if (entry.nodeName.toLowerCase() == 'div'
613 && entry.getAttribute("class")
614 && entry.getAttribute("class").match(/^toclevel/))
615 tocEntriesToRemove.push(entry);
617 for (i = 0; i < tocEntriesToRemove.length; i++) {
618 toc.removeChild(tocEntriesToRemove[i]);
621 // Rebuild TOC entries.
622 var entries = tocEntries(document.getElementById("content"), toclevels);
623 for (var i = 0; i < entries.length; ++i) {
624 var entry = entries[i];
625 if (entry.element.id == "")
626 entry.element.id = "_toc_" + i;
627 var a = document.createElement("a");
628 a.href = "#" + entry.element.id;
629 a.appendChild(document.createTextNode(entry.text));
630 var div = document.createElement("div");
631 div.appendChild(a);
632 div.className = "toclevel" + entry.toclevel;
633 toc.appendChild(div);
635 if (entries.length == 0)
636 toc.parentNode.removeChild(toc);
640 /////////////////////////////////////////////////////////////////////
641 // Footnotes generator
642 /////////////////////////////////////////////////////////////////////
644 /* Based on footnote generation code from:
645 * http://www.brandspankingnew.net/archive/2005/07/format_footnote.html
648 footnotes: function () {
649 // Delete existing footnote entries in case we're reloading the footnodes.
650 var i;
651 var noteholder = document.getElementById("footnotes");
652 if (!noteholder) {
653 return;
655 var entriesToRemove = [];
656 for (i = 0; i < noteholder.childNodes.length; i++) {
657 var entry = noteholder.childNodes[i];
658 if (entry.nodeName.toLowerCase() == 'div' && entry.getAttribute("class") == "footnote")
659 entriesToRemove.push(entry);
661 for (i = 0; i < entriesToRemove.length; i++) {
662 noteholder.removeChild(entriesToRemove[i]);
665 // Rebuild footnote entries.
666 var cont = document.getElementById("content");
667 var spans = cont.getElementsByTagName("span");
668 var refs = {};
669 var n = 0;
670 for (i=0; i<spans.length; i++) {
671 if (spans[i].className == "footnote") {
672 n++;
673 var note = spans[i].getAttribute("data-note");
674 if (!note) {
675 // Use [\s\S] in place of . so multi-line matches work.
676 // Because JavaScript has no s (dotall) regex flag.
677 note = spans[i].innerHTML.match(/\s*\[([\s\S]*)]\s*/)[1];
678 spans[i].innerHTML =
679 "[<a id='_footnoteref_" + n + "' href='#_footnote_" + n +
680 "' title='View footnote' class='footnote'>" + n + "</a>]";
681 spans[i].setAttribute("data-note", note);
683 noteholder.innerHTML +=
684 "<div class='footnote' id='_footnote_" + n + "'>" +
685 "<a href='#_footnoteref_" + n + "' title='Return to text'>" +
686 n + "</a>. " + note + "</div>";
687 var id =spans[i].getAttribute("id");
688 if (id != null) refs["#"+id] = n;
691 if (n == 0)
692 noteholder.parentNode.removeChild(noteholder);
693 else {
694 // Process footnoterefs.
695 for (i=0; i<spans.length; i++) {
696 if (spans[i].className == "footnoteref") {
697 var href = spans[i].getElementsByTagName("a")[0].getAttribute("href");
698 href = href.match(/#.*/)[0]; // Because IE return full URL.
699 n = refs[href];
700 spans[i].innerHTML =
701 "[<a href='#_footnote_" + n +
702 "' title='View footnote' class='footnote'>" + n + "</a>]";
708 install: function(toclevels) {
709 var timerId;
711 function reinstall() {
712 asciidoc.footnotes();
713 if (toclevels) {
714 asciidoc.toc(toclevels);
718 function reinstallAndRemoveTimer() {
719 clearInterval(timerId);
720 reinstall();
723 timerId = setInterval(reinstall, 500);
724 if (document.addEventListener)
725 document.addEventListener("DOMContentLoaded", reinstallAndRemoveTimer, false);
726 else
727 window.onload = reinstallAndRemoveTimer;
731 asciidoc.install();
732 /*]]>*/
733 </script>
734 </head>
735 <body class="article">
736 <div id="header">
737 <h1>GIT bitmap v1 format</h1>
738 <span id="revdate">2023-10-29</span>
739 </div>
740 <div id="content">
741 <div class="sect1">
742 <h2 id="_pack_and_multi_pack_bitmaps">Pack and multi-pack bitmaps</h2>
743 <div class="sectionbody">
744 <div class="paragraph"><p>Bitmaps store reachability information about the set of objects in a packfile,
745 or a multi-pack index (MIDX). The former is defined obviously, and the latter is
746 defined as the union of objects in packs contained in the MIDX.</p></div>
747 <div class="paragraph"><p>A bitmap may belong to either one pack, or the repository&#8217;s multi-pack index (if
748 it exists). A repository may have at most one bitmap.</p></div>
749 <div class="paragraph"><p>An object is uniquely described by its bit position within a bitmap:</p></div>
750 <div class="ulist"><ul>
751 <li>
753 If the bitmap belongs to a packfile, the <em>n</em>th bit corresponds to
754 the <em>n</em>th object in pack order. For a function <code>offset</code> which maps
755 objects to their byte offset within a pack, pack order is defined as
756 follows:
757 </p>
758 <div class="literalblock">
759 <div class="content">
760 <pre><code>o1 &lt;= o2 &lt;==&gt; offset(o1) &lt;= offset(o2)</code></pre>
761 </div></div>
762 </li>
763 <li>
765 If the bitmap belongs to a MIDX, the <em>n</em>th bit corresponds to the
766 <em>n</em>th object in MIDX order. With an additional function <code>pack</code> which
767 maps objects to the pack they were selected from by the MIDX, MIDX order
768 is defined as follows:
769 </p>
770 <div class="literalblock">
771 <div class="content">
772 <pre><code>o1 &lt;= o2 &lt;==&gt; pack(o1) &lt;= pack(o2) /\ offset(o1) &lt;= offset(o2)</code></pre>
773 </div></div>
774 <div class="paragraph"><p>The ordering between packs is done according to the MIDX&#8217;s .rev file.
775 Notably, the preferred pack sorts ahead of all other packs.</p></div>
776 </li>
777 </ul></div>
778 <div class="paragraph"><p>The on-disk representation (described below) of a bitmap is the same regardless
779 of whether or not that bitmap belongs to a packfile or a MIDX. The only
780 difference is the interpretation of the bits, which is described above.</p></div>
781 <div class="paragraph"><p>Certain bitmap extensions are supported (see: Appendix B). No extensions are
782 required for bitmaps corresponding to packfiles. For bitmaps that correspond to
783 MIDXs, both the bit-cache and rev-cache extensions are required.</p></div>
784 </div>
785 </div>
786 <div class="sect1">
787 <h2 id="_on_disk_format">On-disk format</h2>
788 <div class="sectionbody">
789 <div class="ulist"><ul>
790 <li>
792 A header appears at the beginning:
793 </p>
794 <div class="dlist"><dl>
795 <dt class="hdlist1">
796 4-byte signature:
797 </dt>
798 <dd>
800 {<em>B</em>, <em>I</em>, <em>T</em>, <em>M</em>}
801 </p>
802 </dd>
803 <dt class="hdlist1">
804 2-byte version number (network byte order):
805 </dt>
806 <dd>
808 The current implementation only supports version 1
809 of the bitmap index (the same one as JGit).
810 </p>
811 </dd>
812 <dt class="hdlist1">
813 2-byte flags (network byte order):
814 </dt>
815 <dd>
817 The following flags are supported:
818 </p>
819 <div class="ulist"><ul>
820 <li>
822 </p>
823 <div class="dlist"><dl>
824 <dt class="hdlist1">
825 BITMAP_OPT_FULL_DAG (0x1) REQUIRED:
826 </dt>
827 <dd>
829 This flag must always be present. It implies that the
830 bitmap index has been generated for a packfile or
831 multi-pack index (MIDX) with full closure (i.e. where
832 every single object in the packfile/MIDX can find its
833 parent links inside the same packfile/MIDX). This is a
834 requirement for the bitmap index format, also present in
835 JGit, that greatly reduces the complexity of the
836 implementation.
837 </p>
838 </dd>
839 </dl></div>
840 </li>
841 <li>
843 </p>
844 <div class="dlist"><dl>
845 <dt class="hdlist1">
846 BITMAP_OPT_HASH_CACHE (0x4):
847 </dt>
848 <dd>
850 If present, the end of the bitmap file contains
851 <code>N</code> 32-bit name-hash values, one per object in the
852 pack/MIDX. The format and meaning of the name-hash is
853 described below.
854 </p>
855 </dd>
856 </dl></div>
857 </li>
858 <li>
860 </p>
861 <div class="dlist"><dl>
862 <dt class="hdlist1">
863 BITMAP_OPT_LOOKUP_TABLE (0x10):
864 </dt>
865 <dd>
867 If present, the end of the bitmap file contains a table
868 containing a list of <code>N</code> &lt;commit_pos, offset, xor_row&gt;
869 triplets. The format and meaning of the table is described
870 below.
871 </p>
872 <div class="admonitionblock">
873 <table><tr>
874 <td class="icon">
875 <div class="title">Note</div>
876 </td>
877 <td class="content">Unlike the xor_offset used to compress an individual bitmap,
878 <code>xor_row</code> stores an <strong>absolute</strong> index into the lookup table, not a location
879 relative to the current entry.</td>
880 </tr></table>
881 </div>
882 </dd>
883 </dl></div>
884 </li>
885 </ul></div>
886 </dd>
887 <dt class="hdlist1">
888 4-byte entry count (network byte order):
889 </dt>
890 <dd>
892 The total count of entries (bitmapped commits) in this bitmap index.
893 </p>
894 </dd>
895 <dt class="hdlist1">
896 20-byte checksum:
897 </dt>
898 <dd>
900 The SHA1 checksum of the pack/MIDX this bitmap index
901 belongs to.
902 </p>
903 </dd>
904 </dl></div>
905 </li>
906 <li>
908 4 EWAH bitmaps that act as type indexes
909 </p>
910 <div class="paragraph"><p>Type indexes are serialized after the hash cache in the shape
911 of four EWAH bitmaps stored consecutively (see Appendix A for
912 the serialization format of an EWAH bitmap).</p></div>
913 <div class="paragraph"><p>There is a bitmap for each Git object type, stored in the following
914 order:</p></div>
915 <div class="ulist"><ul>
916 <li>
918 Commits
919 </p>
920 </li>
921 <li>
923 Trees
924 </p>
925 </li>
926 <li>
928 Blobs
929 </p>
930 </li>
931 <li>
933 Tags
934 </p>
935 </li>
936 </ul></div>
937 <div class="paragraph"><p>In each bitmap, the `n`th bit is set to true if the `n`th object
938 in the packfile or multi-pack index is of that type.</p></div>
939 <div class="paragraph"><p>The obvious consequence is that the OR of all 4 bitmaps will result
940 in a full set (all bits set), and the AND of all 4 bitmaps will
941 result in an empty bitmap (no bits set).</p></div>
942 </li>
943 <li>
945 N entries with compressed bitmaps, one for each indexed commit
946 </p>
947 <div class="paragraph"><p>Where <code>N</code> is the total number of entries in this bitmap index.
948 Each entry contains the following:</p></div>
949 <div class="ulist"><ul>
950 <li>
952 </p>
953 <div class="dlist"><dl>
954 <dt class="hdlist1">
955 4-byte object position (network byte order):
956 </dt>
957 <dd>
959 The position <strong>in the index for the packfile or
960 multi-pack index</strong> where the bitmap for this commit is
961 found.
962 </p>
963 </dd>
964 </dl></div>
965 </li>
966 <li>
968 </p>
969 <div class="dlist"><dl>
970 <dt class="hdlist1">
971 1-byte XOR-offset:
972 </dt>
973 <dd>
975 The xor offset used to compress this bitmap. For an entry
976 in position <code>x</code>, an XOR offset of <code>y</code> means that the actual
977 bitmap representing this commit is composed by XORing the
978 bitmap for this entry with the bitmap in entry <code>x-y</code> (i.e.
979 the bitmap <code>y</code> entries before this one).
980 </p>
981 <div class="admonitionblock">
982 <table><tr>
983 <td class="icon">
984 <div class="title">Note</div>
985 </td>
986 <td class="content">This compression can be recursive. In order to
987 XOR this entry with a previous one, the previous entry needs
988 to be decompressed first, and so on.</td>
989 </tr></table>
990 </div>
991 <div class="paragraph"><p>The hard-limit for this offset is 160 (an entry can only be
992 xor&#8217;ed against one of the 160 entries preceding it). This
993 number is always positive, and hence entries are always xor&#8217;ed
994 with <strong>previous</strong> bitmaps, not bitmaps that will come afterwards
995 in the index.</p></div>
996 </dd>
997 </dl></div>
998 </li>
999 <li>
1001 </p>
1002 <div class="dlist"><dl>
1003 <dt class="hdlist1">
1004 1-byte flags for this bitmap:
1005 </dt>
1006 <dd>
1008 At the moment the only available flag is <code>0x1</code>, which hints
1009 that this bitmap can be re-used when rebuilding bitmap indexes
1010 for the repository.
1011 </p>
1012 </dd>
1013 </dl></div>
1014 </li>
1015 <li>
1017 The compressed bitmap itself, see Appendix A.
1018 </p>
1019 </li>
1020 </ul></div>
1021 </li>
1022 <li>
1024 </p>
1025 <div class="dlist"><dl>
1026 <dt class="hdlist1">
1027 TRAILER:
1028 </dt>
1029 <dd>
1031 Trailing checksum of the preceding contents.
1032 </p>
1033 </dd>
1034 </dl></div>
1035 </li>
1036 </ul></div>
1037 </div>
1038 </div>
1039 <div class="sect1">
1040 <h2 id="_appendix_a_serialization_format_for_an_ewah_bitmap">Appendix A: Serialization format for an EWAH bitmap</h2>
1041 <div class="sectionbody">
1042 <div class="paragraph"><p>Ewah bitmaps are serialized in the same protocol as the JAVAEWAH
1043 library, making them backwards compatible with the JGit
1044 implementation:</p></div>
1045 <div class="ulist"><ul>
1046 <li>
1048 4-byte number of bits of the resulting UNCOMPRESSED bitmap
1049 </p>
1050 </li>
1051 <li>
1053 4-byte number of words of the COMPRESSED bitmap, when stored
1054 </p>
1055 </li>
1056 <li>
1058 N x 8-byte words, as specified by the previous field
1059 </p>
1060 <div class="paragraph"><p>This is the actual content of the compressed bitmap.</p></div>
1061 </li>
1062 <li>
1064 4-byte position of the current RLW for the compressed
1065 bitmap
1066 </p>
1067 </li>
1068 </ul></div>
1069 <div class="paragraph"><p>All words are stored in network byte order for their corresponding
1070 sizes.</p></div>
1071 <div class="paragraph"><p>The compressed bitmap is stored in a form of run-length encoding, as
1072 follows. It consists of a concatenation of an arbitrary number of
1073 chunks. Each chunk consists of one or more 64-bit words</p></div>
1074 <div class="literalblock">
1075 <div class="content">
1076 <pre><code>H L_1 L_2 L_3 .... L_M</code></pre>
1077 </div></div>
1078 <div class="paragraph"><p>H is called RLW (run length word). It consists of (from lower to higher
1079 order bits):</p></div>
1080 <div class="ulist"><ul>
1081 <li>
1083 1 bit: the repeated bit B
1084 </p>
1085 </li>
1086 <li>
1088 32 bits: repetition count K (unsigned)
1089 </p>
1090 </li>
1091 <li>
1093 31 bits: literal word count M (unsigned)
1094 </p>
1095 </li>
1096 </ul></div>
1097 <div class="paragraph"><p>The bitstream represented by the above chunk is then:</p></div>
1098 <div class="ulist"><ul>
1099 <li>
1101 K repetitions of B
1102 </p>
1103 </li>
1104 <li>
1106 The bits stored in <code>L_1</code> through <code>L_M</code>. Within a word, bits at
1107 lower order come earlier in the stream than those at higher
1108 order.
1109 </p>
1110 </li>
1111 </ul></div>
1112 <div class="paragraph"><p>The next word after <code>L_M</code> (if any) must again be a RLW, for the next
1113 chunk. For efficient appending to the bitstream, the EWAH stores a
1114 pointer to the last RLW in the stream.</p></div>
1115 </div>
1116 </div>
1117 <div class="sect1">
1118 <h2 id="_appendix_b_optional_bitmap_sections">Appendix B: Optional Bitmap Sections</h2>
1119 <div class="sectionbody">
1120 <div class="paragraph"><p>These sections may or may not be present in the <code>.bitmap</code> file; their
1121 presence is indicated by the header flags section described above.</p></div>
1122 </div>
1123 </div>
1124 <div class="sect1">
1125 <h2 id="_name_hash_cache">Name-hash cache</h2>
1126 <div class="sectionbody">
1127 <div class="paragraph"><p>If the BITMAP_OPT_HASH_CACHE flag is set, the end of the bitmap contains
1128 a cache of 32-bit values, one per object in the pack/MIDX. The value at
1129 position <code>i</code> is the hash of the pathname at which the `i`th object
1130 (counting in index or multi-pack index order) in the pack/MIDX can be found.
1131 This can be fed into the delta heuristics to compare objects with similar
1132 pathnames.</p></div>
1133 <div class="paragraph"><p>The hash algorithm used is:</p></div>
1134 <div class="literalblock">
1135 <div class="content">
1136 <pre><code>hash = 0;
1137 while ((c = *name++))
1138 if (!isspace(c))
1139 hash = (hash &gt;&gt; 2) + (c &lt;&lt; 24);</code></pre>
1140 </div></div>
1141 <div class="paragraph"><p>Note that this hashing scheme is tied to the BITMAP_OPT_HASH_CACHE flag.
1142 If implementations want to choose a different hashing scheme, they are
1143 free to do so, but MUST allocate a new header flag (because comparing
1144 hashes made under two different schemes would be pointless).</p></div>
1145 </div>
1146 </div>
1147 <div class="sect1">
1148 <h2 id="_commit_lookup_table">Commit lookup table</h2>
1149 <div class="sectionbody">
1150 <div class="paragraph"><p>If the BITMAP_OPT_LOOKUP_TABLE flag is set, the last <code>N * (4 + 8 + 4)</code>
1151 bytes (preceding the name-hash cache and trailing hash) of the <code>.bitmap</code>
1152 file contains a lookup table specifying the information needed to get
1153 the desired bitmap from the entries without parsing previous unnecessary
1154 bitmaps.</p></div>
1155 <div class="paragraph"><p>For a <code>.bitmap</code> containing <code>nr_entries</code> reachability bitmaps, the table
1156 contains a list of <code>nr_entries</code> &lt;commit_pos, offset, xor_row&gt; triplets
1157 (sorted in the ascending order of <code>commit_pos</code>). The content of the i&#8217;th
1158 triplet is -</p></div>
1159 <div class="ulist"><ul>
1160 <li>
1162 </p>
1163 <div class="dlist"><dl>
1164 <dt class="hdlist1">
1165 commit_pos (4 byte integer, network byte order):
1166 </dt>
1167 <dd>
1169 It stores the object position of a commit (in the midx or pack
1170 index).
1171 </p>
1172 </dd>
1173 </dl></div>
1174 </li>
1175 <li>
1177 </p>
1178 <div class="dlist"><dl>
1179 <dt class="hdlist1">
1180 offset (8 byte integer, network byte order):
1181 </dt>
1182 <dd>
1184 The offset from which that commit&#8217;s bitmap can be read.
1185 </p>
1186 </dd>
1187 </dl></div>
1188 </li>
1189 <li>
1191 </p>
1192 <div class="dlist"><dl>
1193 <dt class="hdlist1">
1194 xor_row (4 byte integer, network byte order):
1195 </dt>
1196 <dd>
1198 The position of the triplet whose bitmap is used to compress
1199 this one, or <code>0xffffffff</code> if no such bitmap exists.
1200 </p>
1201 </dd>
1202 </dl></div>
1203 </li>
1204 </ul></div>
1205 </div>
1206 </div>
1207 </div>
1208 <div id="footnotes"><hr /></div>
1209 <div id="footer">
1210 <div id="footer-text">
1211 Last updated
1212 2023-10-24 06:43:46 JST
1213 </div>
1214 </div>
1215 </body>
1216 </html>