/** SR3C, a symbol ranking data compressor.
 *
 * This file implements a fast and effective data compressor.
 * The compression is on par (k8: i guess ;-) with gzip -7.
 * bzip2 -2 compresses slightly better than SR3C, but takes almost
 * three times as long. Furthermore, since bzip2 is based on
 * Burrows-Wheeler block sorting, it can't be used in on-line
 * compression tasks.
 * Memory consumption of SR3C is currently around 4.5 MB per ongoing
 * compression and decompression.
 *
 * Author: Kenneth Oksanen <cessu@iki.fi>, 2008.
 * Copyright (C) Helsinki University of Technology.
 * D conversion by Ketmar // Invisible Vector <ketmar@ketmar.no-ip.org>
 *
 * This code borrows many ideas and some paragraphs of comments from
 * Matt Mahoney's symbol ranking compression program SR2 and Peter
 * Fenwick's SRANK, but otherwise all code has been implemented from
 * scratch.
 *
 * This file is distributed under the following license:
 *
 * The MIT License
 * Copyright (c) 2008 Helsinki University of Technology
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */
module iv.sr3c /*is aliced*/;
import iv.alice;
/** Prior to compression or uncompression the user of this library
 * creates a "compression context" of type `SR3C` which can, with
 * certain disciplines, be used for both compression and
 * uncompression. The compression context initialization is given a
 * function `outdg` which the compression and uncompression
 * routines call for processing the data further.
 *
 * This library is MT-safe, so any number of concurrent
 * compression or uncompression tasks may be ongoing as long as each
 * of them is passed a distinct compression context. By default each
 * compression context requires approximately 4.5 MB of memory, but
 * see the internal documentation of sr3c.d for further notes on this.
 *
 * Compression is generally NOT synchronous: once some data has been
 * passed to `compress()`, `outdg` is not necessarily passed enough
 * compressed data to uncompress all of that original data. When
 * synchronization is required, for example at the end of successful
 * compression, the user of this library should call `flush()`. The
 * decompressing side will recognize the flush from the compressed
 * data. Only when a compression and decompression have flushed in
 * this manner can the corresponding compression contexts be used in
 * the reverse direction.
 *
 * (A minimal usage sketch appears after the class body below.)
 */
public final class SR3C {
public:
  /** The type of the function used to process the compressed or
   * uncompressed data further. The function should return a non-zero
   * value on failure. The function may not free the memory referred to
   * by the 'bytes' argument. The argument 'flush' is passed a true
   * value iff `flush()` was called after compressing the data.
   */
  alias OutputDg = int delegate (const(void)[] bytes, bool flush);

private:
  enum SR_HMASK = 0xFFFFF;
  enum SR_HMULT = 480;
  enum SR_XSHFT = 20;
  /* It is possible to reduce memory consumption from 4.5 MB to, say,
     1.5 MB with the following definitions, but this will also reduce
     the compression ratio. In my tests on average this would result in
     an almost 4% increase in the size of the compressed data.
       #define SR_HMASK   0x3FFFF
       #define SR_HMULT   352
       #define SR_XSHFT   18
     Even further memory savings are possible, but again with worse
     compression. The following definitions require some 0.75 MB of memory
     and result in 7-8% larger compressed data than with the current default
     definitions. Further memory savings would require changes in the
     secondary arithmetic compression, a.k.a. state maps.
       #define SR_HMASK   0xFFFF
       #define SR_HMULT   288
       #define SR_XSHFT   16
     Conversely, by allowing a loftier memory budget, say 64.5 MB,
     compression can be improved further. The following will result in
     a little over 2% improvement in compression:
       #define SR_HMASK   0xFFFFFF
       #define SR_HMULT   704
       #define SR_XSHFT   24
  */
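
  // Not in the original sources: a compile-time sanity check that makes the
  // relationship between the hash-table size and the checksum shift explicit.
  // The 4-bit checksum is taken from the hash bits just above the table index,
  // so SR_XSHFT must equal the number of index bits in SR_HMASK; this holds
  // for all of the alternative configurations listed in the comment above.
  static assert(SR_HMASK == (1u<<SR_XSHFT)-1, "SR_XSHFT must match the width of SR_HMASK");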

  enum SM_P_HALF = 1U<<31;

  enum SM_HUFF_LO_N = 4*256*3;
  enum SM_HUFF_HI_N1 = 16;
  enum SM_HUFF_HI_N2 = 3;
  enum SM_BYTES_N = 256*256;

  enum {
    SR_FLUSHED,
    SR_COMPRESSING,
    /* The following states are for the uncompressor. */
    SR_FILLING_1, SR_FILLING_2, SR_FILLING_3,
    SR_UNCOMPRESSING_1, SR_UNCOMPRESSING_2, SR_UNCOMPRESSING_3,
    SR_UNCOMPRESSING_BYTE, SR_FLUSHING,
  }

  ubyte* outbuf_p; // current position in the output buffer
  uint[SR_HMASK+1] rank; // symbol ranking tables
  uint hash; // hash value of the current compression context
  int prev_ch; // previous byte we encoded, 0 if none
  /* statemaps map a secondary context to a probability.
   * each statemap entry contains the prediction in the upper 23 bits and a count of
   * how many times this state has been reached in the lower 9 bits. */
  // states for encoding the first three bits in each context; some of the entries in sm_huff_hi are unused
  uint[SM_HUFF_LO_N] sm_huff_lo;
  uint[SM_HUFF_HI_N2][SM_HUFF_HI_N1] sm_huff_hi;
  // states for encoding a byte not predicted by the symbol ranker, using a context-1 arithmetic encoding
  uint[SM_BYTES_N] sm_bytes;
  // arithmetic coder range, initially [0, 1), scaled by 2^32; the field x is used only during uncompression
  uint x1, x2, x;
  // output function and its context; returns non-zero on error
  OutputDg output_f;
  int status; // SR_XXX
  // the following field is used by the uncompressor while uncompressing literal bytes
  int lit_ix;

public:
  this (OutputDg outdg) { reset(outdg); }

private:
  /** Reinitialize the given compression context. If `outdg` is
   * non-null, it is assigned to the compression context's new output
   * function. If, on the other hand, `outdg` is `null`, the callback
   * function remains as it is.
   */
  public void reset (OutputDg outdg=null) {
    int i, j;
    /* Initialize statemaps. The initial values were chosen based on a
       large number of test runs. The cumulative statistics match quite
       well with those of Fenwick:
        - Least recently used matches with 45% probability (over all counts),
        - Second least recently used matches with 15% probability,
        - Third least recently used matches with 7% probability,
        - Literals match with 32% probability.
       Initializing to anything else but SM_P_HALF produces
       proportionally modest benefits for large inputs, but we wanted
       SR3C to be effective also for small inputs. */
    for (i = 0; i < 3*256; i += 3) {
      sm_huff_lo.ptr[i] = (3400<<20)|0;
      sm_huff_lo.ptr[i+1] = (150<<20)|12;
      sm_huff_lo.ptr[i+2] = (1000<<20)|12;
    }
    for (; i < 2*3*256; i += 3) {
      sm_huff_lo.ptr[i] = (1500<<20)|0;
      sm_huff_lo.ptr[i+1] = (1840<<20)|12;
      sm_huff_lo.ptr[i+2] = (780<<20)|12;
    }
    for (; i < 3*3*256; i += 3) {
      sm_huff_lo.ptr[i] = (880<<20)|0;
      sm_huff_lo.ptr[i+1] = (1840<<20)|12;
      sm_huff_lo.ptr[i+2] = (760<<20)|12;
    }
    for (; i < 4*3*256; i += 3) {
      sm_huff_lo.ptr[i] = (780<<20)|0;
      sm_huff_lo.ptr[i+1] = (1840<<20)|12;
      sm_huff_lo.ptr[i+2] = (1120<<20)|12;
    }
    for (i = 0; i < SM_HUFF_HI_N1; i++) {
      sm_huff_hi.ptr[i].ptr[0] = (400<<20)|10;
      sm_huff_hi.ptr[i].ptr[1] = (1840<<20)|12;
      sm_huff_hi.ptr[i].ptr[2] = (1180<<20)|12;
    }
    for (i = 0; i < SM_BYTES_N; i++)
      sm_bytes.ptr[i] = SM_P_HALF;
    for (i = 0; i < SR_HMASK+1; i++)
      rank.ptr[i] = 0;
    prev_ch = 0;
    hash = 0;
    x1 = 0;
    x2 = 0xFEFFFFFF;
    if (outdg !is null) output_f = outdg;
    status = SR_FLUSHED;
  }

  // what is the prediction, as a fractional probability from 0 to 4095, of the next bit being one
  enum SM_PREDICT(string sm) = "(("~sm~")>>20)";
  enum SM_COUNT(string sm) = "(("~sm~")&0x1FF)";

  enum SM_UPDATE_RANKS(string sm, string bit) = "do {\n"~
    //"assert(bit == 0 || bit == 1);\n"~
    "int count = "~sm~"&0x1FF, prediction__ = "~sm~">>9;\n"~
    "int d = ("~bit~"<<23)-prediction__;\n"~
    ""~sm~" += (d*sr_wt_ranks.ptr[count])&0xFFFFFE00;\n"~
    "if (d < 0) d = -d;\n"~
    "d >>= 17;\n"~
    "//"~sm~" += sr_ct_ranks.ptr[d].ptr[count];\n"~
    ""~sm~" ^= sr_ct_ranks.ptr[d].ptr[count];\n"~
    "} while (0);";

  enum SM_UPDATE_BYTES(string sm, string bit) = "do {\n"~
    //"assert(bit == 0 || bit == 1);\n"~
    "int count = "~sm~"&0x1FF, prediction__ = "~sm~">>9;\n"~
    "int d = ("~bit~"<<23)-prediction__;\n"~
    ""~sm~" += (d*sr_wt_bytes.ptr[count])&0xFFFFFE00;\n"~
    "if (d < 0) d = -d;\n"~
    "d >>= 17;\n"~
    ""~sm~" ^= sr_ct_bytes.ptr[d].ptr[count];\n"~
    "} while (0);";

  // compress bit in the given state map, possibly outputting bytes in the process
  enum SR_ENCODE_RANK_BIT(string sm, string bit) = "do {\n"~
    "uint xmid;\n"~
    "int prediction_ = "~SM_PREDICT!sm~";\n"~
    //"assert(bit == 0 || bit == 1);\n"~
    "assert(prediction_ >= 0 && prediction_ < 4096);\n"~
    "xmid = x1+((x2-x1)>>12)*prediction_;\n"~
    "assert(xmid >= x1 && xmid < x2);\n"~
    "if ("~bit~") x2 = xmid; else x1 = xmid+1;\n"~
    SM_UPDATE_RANKS!(sm, bit)~"\n"~
    "/* pass equal leading bytes of range */\n"~
    "while ((x1>>24) == (x2>>24)) {\n"~
    " *outbuf_p++ = x2>>24;\n"~
    " x1 <<= 8;\n"~
    " x2 = (x2<<8)+255;\n"~
    "}\n"~
    "} while (0);";

  enum SR_ENCODE_BYTE_BIT(string sm, string bit) = "do {\n"~
    "uint xmid;\n"~
    "int prediction_ = "~SM_PREDICT!sm~";\n"~
    //"assert(bit == 0 || bit == 1);\n"~
    "assert(prediction_ >= 0 && prediction_ < 4096);\n"~
    "xmid = x1+((x2-x1)>>12)*prediction_;\n"~
    "assert(xmid >= x1 && xmid < x2);\n"~
    "if ("~bit~") x2 = xmid; else x1 = xmid+1;\n"~
    SM_UPDATE_BYTES!(sm, bit)~"\n"~
    "// pass equal leading bytes of range\n"~
    "while ((x1>>24) == (x2>>24)) {\n"~
    " *outbuf_p++ = x2>>24;\n"~
    " x1 <<= 8;\n"~
    " x2 = (x2<<8)+255;\n"~
    "}\n"~
    "} while (0);";

  enum SR_ENCODE_BYTE(string ch) = "do {\n"~
    "uint* sm_ = &sm_bytes.ptr[256*prev_ch];\n"~
    "uint ix = 1, x = "~ch~", bit;\n"~
    "do {\n"~
    " bit = (x>>7)&0x1;\n"~
    " "~SR_ENCODE_BYTE_BIT!("sm_[ix]", "bit")~"\n"~
    " ix = (ix<<1)|bit;\n"~
    " x <<= 1;\n"~
    "} while (ix < 256);\n"~
    "} while (0);";

  enum OUTBUF_SIZE = 1024;

  /** Compress the given bytes, possibly sending some compressed data to
   * `outdg`. Returns zero on success, or whatever `outdg` returned
   * should it have erred. Once the context has been used for
   * compressing, it can't be used for uncompressing before `flush()`
   * has been called.
   */
  public int compress (const(void)[] data) {
    const(ubyte)* bytes = cast(const(ubyte)*)data.ptr;
    usize n_bytes = data.length;
    uint index, xsum, r;
    uint* sm;
    int ch;
    ubyte[OUTBUF_SIZE] outbuf;
    /* The theoretical worst case is that each bit is encoded into 12
       bits, and there can be 10 bits of output per bit sent to the
       arithmetic coder, or 15 bytes. Additionally there may be some
       bytes of flushing from the arithmetic coder itself, thus a margin
       of a slightly cautious 30 bytes. */
    ubyte* outbuf_end = outbuf.ptr+OUTBUF_SIZE-30;

    assert(status == SR_FLUSHED || status == SR_COMPRESSING);
    status = SR_COMPRESSING;

    while (n_bytes > 0) {
      outbuf_p = outbuf.ptr;
      while (outbuf_p < outbuf_end && n_bytes > 0) {
        --n_bytes;
        index = hash&SR_HMASK;
        xsum = (hash>>SR_XSHFT)&0xF;
        r = rank.ptr[index];
        if (((r>>24)&0xF) != xsum) {
          // hash collision: pick another index, use it if checksums match or it has a lower first-hit counter value
          int alt_index = (index+0x3A77)&SR_HMASK;
          uint alt_r = rank.ptr[alt_index];
          if (((alt_r>>24)&0xF) == xsum || alt_r < r) {
            index = alt_index;
            r = alt_r;
          }
        }
        if (r >= 0x40000000)
          sm = &sm_huff_hi.ptr[r>>28].ptr[0];
        else
          sm = &sm_huff_lo.ptr[3*(prev_ch|((r>>20)&0x300))];
        ch = *bytes++;
        xsum = (xsum<<24)|ch;
        if (ch == (r&0xFF)) {
          // is ch the least recently seen?
          mixin(SR_ENCODE_RANK_BIT!("sm[0]", "0"));
          if (r < 0xF0000000) r += 0x10000000; // increment hit count
        } else {
          mixin(SR_ENCODE_RANK_BIT!("sm[0]", "1"));
          if (ch == ((r>>8)&0xFF)) {
            // is ch the second least recent?
            mixin(SR_ENCODE_RANK_BIT!("sm[1]", "1"));
            mixin(SR_ENCODE_RANK_BIT!("sm[2]", "0"));
            if ((r>>28) >= 0xC)
              r &= 0x4FFFFFFF;
            else
              r = (r&0xFF0000)|((r&0xFF)<<8)|xsum|0x10000000;
          } else if (ch == ((r>>16)&0xFF)) {
            // is ch the third least recent?
            mixin(SR_ENCODE_RANK_BIT!("sm[1]", "1"));
            mixin(SR_ENCODE_RANK_BIT!("sm[2]", "1"));
            r = ((r&0xFFFF)<<8)|xsum|0x10000000;
          } else {
            mixin(SR_ENCODE_RANK_BIT!("sm[1]", "0"));
            mixin(SR_ENCODE_BYTE!"ch");
            r = ((r&0xFFFF)<<8)|xsum;
          }
        }
        rank.ptr[index] = r;
        prev_ch = ch;
        hash = SR_HMULT*hash+ch+1;
      }
      if (outbuf_p > outbuf.ptr) {
        if (int err = output_f(outbuf[0..outbuf_p-outbuf.ptr], false)) return err;
      }
    }
    return 0;
  }

  /** Flush the internal buffered state of the compression context to
   * `outdg` so that an uncompression of all the data provided prior
   * to the flush becomes possible. Returns zero on success, or
   * whatever `outdg` returned should it have erred. It is possible
   * to continue compression after flushing the buffers, but each flush
   * will cause at least a three-byte overhead, probably higher.
   * File compressors and archivers typically flush only at the end of
   * compression, but message-interchanging programs flush whenever
   * real-time requirements demand it or a response is required to proceed.
   *
   * Note that `flush()` may not be called for a context currently used
   * for uncompression.
   */
  public int flush () {
    uint r;
    uint* sm;
    int index, xsum, i, err;
    ubyte[128] outbuf;
    outbuf_p = &outbuf[0];

    assert(status == SR_COMPRESSING || status == SR_FLUSHED);

    /* Pick state map entry as if compressing a normal data byte. */
    index = hash&SR_HMASK;
    xsum = (hash>>SR_XSHFT)&0xF;
    r = rank.ptr[index];
    if (((r>>24)&0xF) != xsum) {
      int alt_index = (index+0x3A77)&SR_HMASK;
      uint alt_r = rank.ptr[alt_index];
      if (((alt_r>>24)&0xF) == xsum || alt_r < r) {
        index = alt_index;
        r = alt_r;
      }
    }
    if (r >= 0x40000000)
      sm = &sm_huff_hi.ptr[r>>28].ptr[0];
    else
      sm = &sm_huff_lo.ptr[3*(prev_ch|((r>>20)&0x300))];

    /* Mark end of data by coding the third least recently used byte as
       a literal. */
    mixin(SR_ENCODE_RANK_BIT!("sm[0]", "1"));
    mixin(SR_ENCODE_RANK_BIT!("sm[1]", "0"));
    mixin(SR_ENCODE_BYTE!"((r>>16)&0xFF)");

    /* Flush also the arithmetic encoder, first by the first unequal
       byte in the range and thereafter three maximum bytes. */
    *outbuf_p++ = x1>>24;
    *outbuf_p++ = 0xFF;
    *outbuf_p++ = 0xFF;
    *outbuf_p++ = 0xFF;

    /* Finally send this all out. */
    err = output_f(outbuf[0..outbuf_p-outbuf.ptr], true);
    if (err) return err;

    /* Reset internal values in the context, not however the statemaps or ranks. */
    prev_ch = 0;
    hash = 0;
    x1 = 0;
    x2 = 0xFEFFFFFF;
    status = SR_FLUSHED;
    return 0;
  }

  /** Return true if the given context is flushed, i.e. if the
   * context can now be used for either compression or uncompression.
   */
  public @property bool flushed () const pure nothrow @safe @nogc { return (status == SR_FLUSHED); }

  /** Uncompress the given bytes, possibly sending some uncompressed data
   * to `outdg`. Returns zero on success, or whatever `outdg`
   * returned should it have failed. SR3C includes no checksum, so
   * corrupted compressed messages will not be detected. Once the
   * context has been used for uncompressing, it can't be used for
   * compressing before `flush()` has been issued on the corresponding
   * compressing side and the resulting compressed data has been
   * uncompressed.
   */
  public int uncompress (const(void)[] data) {
    enum SR_INPUT(string state_name) =
      "do {\n"~
      ""~state_name~":\n"~
      " while ((x1>>24) == (x2>>24)) {\n"~
      "  if (n_bytes == 0) { status = "~state_name~"; goto out_of_input; }\n"~
      "  x1 <<= 8;\n"~
      "  x2 = (x2<<8)+255;\n"~
      "  x = (x<<8)+*bytes++;\n"~
      "  n_bytes--;\n"~
      " }\n"~
      "} while (0);";

    enum SR_DECODE_BIT(string sm, string bit, string state_name) =
      "do {\n"~
      " "~SR_INPUT!(state_name)~"\n"~
      " prediction = "~SM_PREDICT!(sm)~";\n"~
      " assert(prediction >= 0 && prediction < 4096);\n"~
      " xmid = x1+((x2-x1)>>12)*prediction;\n"~
      " assert(xmid >= x1 && xmid < x2);\n"~
      " if (x <= xmid) {\n"~
      "  "~bit~" = 1;\n"~
      "  x2 = xmid;\n"~
      " } else {\n"~
      "  "~bit~" = 0;\n"~
      "  x1 = xmid+1;\n"~
      " }\n"~
      "} while (0);";

    const(ubyte)* bytes = cast(const(ubyte)*)data.ptr;
    usize n_bytes = data.length;
    uint xmid;
    int prediction;

    uint index, xsum, r;
    uint* sm;
    int bit, ch, err;
    ubyte[OUTBUF_SIZE] outbuf;
    outbuf_p = outbuf.ptr;
    ubyte* outbuf_end = outbuf.ptr+OUTBUF_SIZE;

    assert(status != SR_COMPRESSING);

    switch (status) {
      case SR_FLUSHED:
        while (n_bytes > 0 && *bytes == 0xFF) {
          bytes++;
          n_bytes--;
        }
        if (n_bytes-- == 0) return 0;
       restart:
        x = (x<<8)|*bytes++;
        goto case;
      case SR_FILLING_1:
        if (n_bytes-- == 0) {
          status = SR_FILLING_1;
          return 0;
        }
        x = (x<<8)|*bytes++;
        goto case;
      case SR_FILLING_2:
        if (n_bytes-- == 0) {
          status = SR_FILLING_2;
          return 0;
        }
        x = (x<<8)|*bytes++;
        goto case;
      case SR_FILLING_3:
        if (n_bytes-- == 0) {
          status = SR_FILLING_3;
          return 0;
        }
        x = (x<<8)|*bytes++;
        status = SR_UNCOMPRESSING_1;
        break;
      case SR_FLUSHING:
        goto SR_FLUSHING;
      default:
        // the default branch is here only to keep the compiler happy
        break;
    }

    index = hash&SR_HMASK;
    xsum = (hash>>SR_XSHFT)&0xF;
    r = rank.ptr[index];
    if (((r>>24)&0xF) != xsum) {
      // hash collision: pick another index, use it if checksums match or it has a lower first-hit counter value
      int alt_index = (index+0x3A77)&SR_HMASK;
      uint alt_r = rank.ptr[alt_index];
      if (((alt_r>>24)&0xF) == xsum || alt_r < r) {
        index = alt_index;
        r = alt_r;
      }
    }
    if (r >= 0x40000000)
      sm = &sm_huff_hi.ptr[r>>28].ptr[0];
    else
      sm = &sm_huff_lo.ptr[3*(prev_ch|((r>>20)&0x300))];
    xsum <<= 24;

    switch (status) {
      case SR_UNCOMPRESSING_1: goto SR_UNCOMPRESSING_1;
      case SR_UNCOMPRESSING_2: goto SR_UNCOMPRESSING_2;
      case SR_UNCOMPRESSING_3: goto SR_UNCOMPRESSING_3;
      case SR_UNCOMPRESSING_BYTE: sm = &sm_bytes.ptr[256*prev_ch]; goto SR_UNCOMPRESSING_BYTE;
      default: assert(0);
    }

    for (;;) {
      index = hash&SR_HMASK;
      xsum = (hash>>SR_XSHFT)&0xF;
      r = rank.ptr[index];
      if (((r>>24)&0xF) != xsum) {
        // hash collision: pick another index, use it if checksums match or it has a lower first-hit counter value
        int alt_index = (index+0x3A77)&SR_HMASK;
        uint alt_r = rank.ptr[alt_index];
        if (((alt_r>>24)&0xF) == xsum || alt_r < r) {
          index = alt_index;
          r = alt_r;
        }
      }
      if (r >= 0x40000000)
        sm = &sm_huff_hi.ptr[r>>28].ptr[0];
      else
        sm = &sm_huff_lo.ptr[3*(prev_ch|((r>>20)&0x300))];
      xsum <<= 24;

      mixin(SR_DECODE_BIT!("sm[0]", "bit", "SR_UNCOMPRESSING_1"));
      mixin(SM_UPDATE_RANKS!("sm[0]", "bit"));
      if (bit) {
        mixin(SR_DECODE_BIT!("sm[1]", "bit", "SR_UNCOMPRESSING_2"));
        mixin(SM_UPDATE_RANKS!("sm[1]", "bit"));
        if (bit) {
          mixin(SR_DECODE_BIT!("sm[2]", "bit", "SR_UNCOMPRESSING_3"));
          mixin(SM_UPDATE_RANKS!("sm[2]", "bit"));
          if (bit) {
            // third least recent byte
            ch = (r>>16)&0xFF;
            r = ((r&0xFFFF)<<8)|ch|xsum|0x10000000;
          } else {
            // second least recent byte
            ch = (r>>8)&0xFF;
            if ((r>>28) >= 0xC)
              r &= 0x4FFFFFFF;
            else
              r = (r&0xFF0000)|((r&0xFF)<<8)|ch|xsum|0x10000000;
          }
        } else {
          sm = &sm_bytes.ptr[256*prev_ch];
          lit_ix = 1;
          do {
            mixin(SR_DECODE_BIT!("sm[lit_ix]", "bit", "SR_UNCOMPRESSING_BYTE"));
            mixin(SM_UPDATE_BYTES!("sm[lit_ix]", "bit"));
            lit_ix = (lit_ix<<1)|bit;
          } while (lit_ix < 256);
          ch = lit_ix&0xFF;
          if (ch == ((r>>16)&0xFF)) goto flush;
          r = ((r&0xFFFF)<<8)|ch|xsum;
        }
      } else {
        // least recent byte
        ch = r&0xFF;
        if (r < 0xF0000000) r += 0x10000000;
      }
      *outbuf_p++ = cast(ubyte)ch;
      rank.ptr[index] = r;
      prev_ch = ch;
      hash = SR_HMULT*hash+ch+1;
      if (outbuf_p == outbuf_end) {
        err = output_f(outbuf[0..OUTBUF_SIZE], false);
        if (err) return err;
        outbuf_p = outbuf.ptr;
      }
    }

   flush:
    // we come here when we have received a flush.
    // pass the flush-induced bytes in the data stream and reset internal values
    // in the context, not however the statemaps or ranks.
    mixin(SR_INPUT!"SR_FLUSHING");
    prev_ch = 0;
    hash = 0;
    x1 = 0;
    x2 = 0xFEFFFFFF;
    status = SR_FLUSHED;
    /* Skip 0xFF-bytes. */
    while (n_bytes > 0 && *bytes == 0xFF) {
      bytes++;
      n_bytes--;
    }
    err = output_f(outbuf[0..outbuf_p-outbuf.ptr], true);
    if (err) return err;
    outbuf_p = outbuf.ptr;
    if (n_bytes-- > 0) {
      outbuf_p = outbuf.ptr;
      goto restart;
    }
    return 0;

   out_of_input:
    if (outbuf_p != outbuf.ptr) return output_f(outbuf[0..outbuf_p-outbuf.ptr], false);
    return 0;
  }

static:
  // some precomputed tables on how the secondary arithmetical compressor adjusts to actual data; see comments later
  static immutable ubyte[512] sr_wt_bytes;
  static immutable ubyte[512] sr_wt_ranks;
  static immutable ushort[512][65] sr_ct_bytes;
  static immutable ushort[512][65] sr_ct_ranks;

  shared static this () pure nothrow @safe @nogc {
    // initialize the various precomputed tables
    foreach (int count; 0..512) {
      // a table indexed by the current state map count and returning the corresponding responsiveness to unpredicted input;
      // found with numeric optimization for the order-0 arithmetic compressor
      sr_wt_bytes[count] = cast(ubyte)(2.814086+(1.107489*count+639.588922)/(0.006940*count*count+count+6.318012));
      // as above, but for rank encoding data
      sr_wt_ranks[count] = cast(ubyte)(-1.311630+(0.616477*count+640.391038)/(0.001946*count*count+count+5.632143));
      foreach (int d; 0..64+1) {
        // a table for updating the count-part in statemaps for bytes
        int c = (d > 1.1898325 && count > 19.782085 ? cast(int)(-2+0.021466*(d-1.1898325)*(count-19.782085)) : -2);
        if (c > count) c = 0; else if (count-c > 0x1FF) c = 0x1FF; else c = count-c;
        sr_ct_bytes[d][count] = cast(ushort)(c^count);
        // a table for updating the count-part in statemaps for ranks
        c = (count > 33.861341 ? cast(int)(-2+0.005355*(d+0.981405)*(count-33.861341)) : -2);
        if (c > count) c = 0; else if (count-c > 0x1FF) c = 0x1FF; else c = count-c;
        sr_ct_ranks[d][count] = cast(ushort)(c^count);
      }
    }
  }
}
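
// A minimal usage sketch, not part of the original sources: it round-trips a
// short buffer through one compressing and one uncompressing context via the
// public API above (compress/flush/uncompress/flushed). The version identifier
// `sr3c_usage_example` and all local names are illustrative only.
version(sr3c_usage_example) unittest {
  ubyte[] packed;   // receives compressed bytes from the compressing context
  ubyte[] restored; // receives uncompressed bytes from the uncompressing context
  auto comp = new SR3C((const(void)[] bytes, bool flush) { packed ~= cast(const(ubyte)[])bytes; return 0; });
  auto deco = new SR3C((const(void)[] bytes, bool flush) { restored ~= cast(const(ubyte)[])bytes; return 0; });
  auto src = cast(const(ubyte)[])"hello, hello, hello symbol ranking world!";
  assert(comp.compress(src) == 0);
  assert(comp.flush() == 0); // make the compressed stream decodable up to this point
  assert(deco.uncompress(packed) == 0);
  assert(deco.flushed); // the decompressor has recognized the flush marker
  assert(restored == src);
}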

/* This code is based on many of the ideas in Matt Mahoney's SR2
   symbol ranking compression program
     http://www.cs.fit.edu/~mmahoney/compression/#sr2
   which in turn is based on SRANK by P. M. Fenwick in 1997-98
     ftp://ftp.cs.auckland.ac.nz/pub/staff/peter-f/srank.c
   See also Fenwick's technical report "A fast, constant-order, symbol
   ranking text compressor", The University of Auckland, Department of
   Computer Science Report No 145, Apr. 1997.

   A symbol ranking compressor maintains a list (a move-to-front
   queue) of the most recently seen symbols (bytes) in the given
   context in order of time since last seen. The original SRANK used
   a table of 2^10 to 2^18 hashed order-3 contexts, SR2 uses 2^20
   hashed order-4 contexts, and some versions of SR3 use 2^24
   contexts, which we considered excessive in terms of implied
   memory consumption. All the mentioned programs use a queue length
   of 3. SR2 furthermore uses a 6 bit count (n) of consecutive hits;
   SR3C uses only a 4 bit count but employs a 4-bit checksum to reduce
   hash collisions.

   SR2 as well as SR3C follow Fenwick's suggestion for a hardware
   implementation in sending 0 when the byte matches the first symbol
   in the queue, 110 and 111 for the second and third least recently
   seen, and 10xxxxxxxx for literals. These are compressed further
   with arithmetic coding using both an order-1 context (last coded
   byte) and the count as context, or order-0 and count if the count
   is greater than or equal to four.

   Codes and updates are as follows:

     Input    Code          Next state
                            (c1 c2 c3 n)
     -----    ----          ---------------
     Initial                (0, 0, 0, 0)
     c1       0             (c1, c2, c3, min(n+1, 15))
     c2       110           (c2, c1, c3, 1)
     c3       111           (c3, c1, c2, 1)
     other c  10cccccccc    (c, c1, c2, 0)

   As an exception, however, in SR3C if the input is c2 and n has reached
   very high counts (12 or more), then the count is reduced to four
   but the queue is kept intact.

   After coding byte c, the hash index h is updated to h * 480 + c + 1
   (mod 2^20), which depends on only the last 4 bytes. SR2 did not
   detect hash collisions, but SR3C uses a 4-bit checksum which is
   formed by the next four higher bits of the hash. The values are
   packed into a 32 bit integer as follows: c1 in bits 0-7, c2 in
   8-15, c3 in 16-23, checksum in 24-27, n in 28-31 (a sketch of this
   unpacking follows this comment).

   End of file is marked by coding c3 as a literal (SR2 used c1)
   followed by three 0xFF's. The compressor library adds no header,
   but ensures the message doesn't begin with 0xFF so as to enable
   arbitrary concatenation of messages. Additional headers and checksums
   may be added by a stand-alone archiver using this library.

   Arithmetic coding is performed as in SR2, and we refer the reader
   to its documentation. Let it only be noted that SR3C uses far fewer
   arithmetic coding states than SR2, thus saving ~1.5 MB of RAM while
   incidentally also achieving a slightly better compression ratio.
   There are also several other differences between SR2 and SR3C in how
   the predictions for the next bits are updated. */
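
// Illustrative sketch only, not used anywhere by the compressor above: it
// unpacks a rank-table word according to the bit layout described in the
// comment above (c1/c2/c3 queue bytes, 4-bit checksum, 4-bit hit count n).
// The version identifier, struct, and function names are hypothetical.
version(sr3c_rank_demo) {
  struct RankFields { ubyte c1, c2, c3; ubyte checksum; ubyte n; }
  RankFields unpackRank (uint r) pure nothrow @safe @nogc {
    RankFields f;
    f.c1 = r&0xFF;            // first queue entry (bits 0-7)
    f.c2 = (r>>8)&0xFF;       // second queue entry (bits 8-15)
    f.c3 = (r>>16)&0xFF;      // third queue entry (bits 16-23)
    f.checksum = (r>>24)&0xF; // 4-bit hash checksum (bits 24-27)
    f.n = r>>28;              // consecutive-hit count (bits 28-31)
    return f;
  }
}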