release/src-rt-6.x.4708/tools/misc/xz/doc/man/txt/xz.txt

   1 XZ(1)                              XZ Utils                              XZ(1)
   2
   3
   4
   5 NAME
   6        xz,  unxz,  xzcat, lzma, unlzma, lzcat - Compress or decompress .xz and
   7        .lzma files
   8
   9 SYNOPSIS
  10        xz [option]...  [file]...
  11
  12        unxz is equivalent to xz --decompress.
  13        xzcat is equivalent to xz --decompress --stdout.
  14        lzma is equivalent to xz --format=lzma.
  15        unlzma is equivalent to xz --format=lzma --decompress.
  16        lzcat is equivalent to xz --format=lzma --decompress --stdout.
  17
  18        When writing scripts that need to decompress files, it  is  recommended
  19        to  always use the name xz with appropriate arguments (xz -d or xz -dc)
  20        instead of the names unxz and xzcat.
  21
  22 DESCRIPTION
  23        xz is a general-purpose data compression tool with command line  syntax
  24        similar  to  gzip(1)  and  bzip2(1).  The native file format is the .xz
  25        format, but the legacy .lzma format used by LZMA  Utils  and  raw  com-
  26        pressed streams with no container format headers are also supported.
  27
  28        xz compresses or decompresses each file according to the selected oper-
  29        ation mode.  If no files are given or file is -, xz reads from standard
  30        input and writes the processed data to standard output.  xz will refuse
  31        (display an error and skip the file) to write compressed data to  stan-
  32        dard  output  if  it  is a terminal.  Similarly, xz will refuse to read
  33        compressed data from standard input if it is a terminal.
  34
  35        Unless --stdout is specified, files other than - are written to  a  new
  36        file whose name is derived from the source file name:
  37
  38        o  When  compressing,  the  suffix  of  the  target file format (.xz or
  39           .lzma) is appended to the source filename to get  the  target  file-
  40           name.
  41
  42        o  When  decompressing,  the  .xz  or  .lzma suffix is removed from the
  43           filename to get the target filename.  xz also  recognizes  the  suf-
  44           fixes .txz and .tlz, and replaces them with the .tar suffix.
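
The two naming rules above can be sketched in a few lines of Python.  This is
only an illustration of the rules as stated here, not the code xz itself uses;
the function name target_name and its parameters are hypothetical.

```python
def target_name(source: str, compress: bool, fmt: str = "xz") -> str:
    """Derive the target filename from the source filename.

    fmt is the target format suffix without the dot ("xz" or "lzma").
    """
    if compress:
        # Compressing: append the suffix of the target file format.
        return source + "." + fmt
    # Decompressing: .txz and .tlz become .tar; .xz and .lzma are removed.
    rules = ((".txz", ".tar"), (".tlz", ".tar"), (".xz", ""), (".lzma", ""))
    for suffix, replacement in rules:
        if source.endswith(suffix) and len(source) > len(suffix):
            return source[: -len(suffix)] + replacement
    raise ValueError(f"{source}: filename has an unknown suffix")
```

For example, target_name("backup.txz", compress=False) yields "backup.tar",
and target_name("notes.txt", compress=True) yields "notes.txt.xz".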
  45
  46        If  the  target file already exists, an error is displayed and the file
  47        is skipped.
  48
  49        Unless writing to standard output, xz will display a warning  and  skip
  50        the file if any of the following applies:
  51
  52        o  File  is  not  a regular file.  Symbolic links are not followed, and
  53           thus they are not considered to be regular files.
  54
  55        o  File has more than one hard link.
  56
  57        o  File has setuid, setgid, or sticky bit set.
  58
  59        o  The operation mode is set to compress and the  file  already  has  a
  60           suffix  of  the  target file format (.xz or .txz when compressing to
  61           the .xz format, and .lzma or .tlz when compressing to the .lzma for-
  62           mat).
  63
  64        o  The  operation mode is set to decompress and the file doesn't have a
  65           suffix of any of the supported file formats (.xz,  .txz,  .lzma,  or
  66           .tlz).
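
The first three conditions in the list above depend only on the file's mode
and link count, so they can be checked from the result of an lstat(2) call.
The following is a minimal sketch under that assumption; skip_reasons is a
hypothetical helper, and the two suffix-related conditions are left out.

```python
import stat
from typing import List


def skip_reasons(mode: int, nlink: int) -> List[str]:
    """Return the reasons (possibly none) why a file would be skipped.

    mode and nlink are the st_mode and st_nlink fields of an lstat()
    result, so a symbolic link is reported as not being a regular file.
    """
    reasons = []
    if not stat.S_ISREG(mode):
        reasons.append("not a regular file")
    if nlink > 1:
        reasons.append("more than one hard link")
    if mode & (stat.S_ISUID | stat.S_ISGID | stat.S_ISVTX):
        reasons.append("setuid, setgid, or sticky bit set")
    return reasons
```

A caller would typically pass the fields of os.lstat(path), for example
skip_reasons(st.st_mode, st.st_nlink).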
  67
  68        After successfully compressing or decompressing the file, xz copies the
  69        owner, group, permissions, access time, and modification time from  the
  70        source  file  to the target file.  If copying the group fails, the per-
  71        missions are modified so that the target file doesn't become accessible
  72        to  users  who  didn't  have  permission to access the source file.  xz
  73        doesn't support copying other metadata like  access  control  lists  or
  74        extended attributes yet.
  75
  76        Once  the  target file has been successfully closed, the source file is
  77        removed unless --keep was specified.  The source file is never  removed
  78        if the output is written to standard output.
  79
  80        Sending  SIGINFO  or  SIGUSR1 to the xz process makes it print progress
  81        information to standard error.  This has only limited  use  since  when
  82        standard error is a terminal, using --verbose will display an automati-
  83        cally updating progress indicator.
  84
  85    Memory usage
  86        The memory usage of xz varies from a few hundred kilobytes  to  several
  87        gigabytes  depending  on  the  compression settings.  The settings used
  88        when compressing a file determine the memory requirements of the decom-
  89        pressor.  Typically the decompressor needs 5 % to 20 % of the amount of
  90        memory that the compressor needed when creating the file.  For example,
  91        decompressing  a  file  created with xz -9 currently requires 65 MiB of
  92        memory.  Still, it is possible to have .xz files that  require  several
  93        gigabytes of memory to decompress.
  94
  95        Especially  users  of  older  systems  may find the possibility of very
  96        large memory usage annoying.  To prevent  uncomfortable  surprises,  xz
  97        has  a  built-in  memory  usage  limiter, which is disabled by default.
  98        While some operating systems provide ways to limit the memory usage  of
   99        processes, relying on them wasn't deemed to be flexible enough (e.g.
 100        using ulimit(1) to limit virtual memory tends to cripple mmap(2)).
 101
 102        The memory usage limiter can be enabled with the  command  line  option
 103        --memlimit=limit.  Often it is more convenient to enable the limiter by
 104        default  by  setting  the  environment   variable   XZ_DEFAULTS,   e.g.
 105        XZ_DEFAULTS=--memlimit=150MiB.   It is possible to set the limits sepa-
 106        rately for  compression  and  decompression  by  using  --memlimit-com-
 107        press=limit  and  --memlimit-decompress=limit.  Using these two options
 108        outside XZ_DEFAULTS is rarely useful because a single run of xz  cannot
 109        do  both  compression  and  decompression  and  --memlimit=limit (or -M
 110        limit) is shorter to type on the command line.
 111
 112        If the specified memory usage limit is exceeded when decompressing,  xz
 113        will  display  an  error  and decompressing the file will fail.  If the
 114        limit is exceeded when compressing, xz will try to scale  the  settings
 115        down  so that the limit is no longer exceeded (except when using --for-
 116        mat=raw or --no-adjust).  This way the operation won't fail unless  the
 117        limit is very small.  The scaling of the settings is done in steps that
 118        don't match the compression level presets, e.g. if the  limit  is  only
 119        slightly  less than the amount required for xz -9, the settings will be
 120        scaled down only a little, not all the way down to xz -8.
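
The decompression side of the limiter can be tried out with Python's standard
lzma module, which wraps liblzma.  Using that module instead of the xz command
is an assumption of this sketch, and try_decompress is a hypothetical helper.

```python
import lzma

original = b"hello " * 1000
# Preset 6 uses an 8 MiB dictionary, so the decoder needs roughly 9 MiB.
blob = lzma.compress(original, preset=6)


def try_decompress(data: bytes, limit: int):
    """Decompress data under a memory limit in bytes; None if it is exceeded."""
    try:
        return lzma.decompress(data, memlimit=limit)
    except (lzma.LZMAError, MemoryError):
        return None


too_small = try_decompress(blob, 1 << 20)    # 1 MiB limit: refused
large_enough = try_decompress(blob, 64 << 20)  # 64 MiB limit: succeeds
```

Note that the size of the input does not matter here: the decoder's memory
need follows from the dictionary size recorded when the file was compressed.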
 121
 122    Concatenation and padding with .xz files
 123        It is possible to concatenate .xz files as is.  xz will decompress such
 124        files as if they were a single .xz file.
 125
 126        It  is  possible  to  insert  padding between the concatenated parts or
 127        after the last part.  The padding must consist of null  bytes  and  the
 128        size of the padding must be a multiple of four bytes.  This can be use-
 129        ful e.g. if the .xz file is stored on a medium that measures file sizes
 130        in 512-byte blocks.
 131
 132        Concatenation  and  padding  are  not  allowed  with .lzma files or raw
 133        streams.
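
Concatenation can be illustrated with Python's standard lzma module, used here
as a stand-in for the xz command (an assumption of this sketch).  The helper
padding_to_block is hypothetical; it only computes how many null bytes would
be needed to reach a block boundary and does not write any padding.

```python
import lzma

part1 = lzma.compress(b"first part\n")
part2 = lzma.compress(b"second part\n")

# Two complete .xz streams joined as is decode like a single file.
joined = lzma.decompress(part1 + part2)


def padding_to_block(size: int, block: int = 512) -> int:
    """Number of null bytes needed to pad size up to a multiple of block."""
    return -size % block


# The size of an .xz stream is a multiple of four bytes, so padding it to a
# 512-byte boundary always gives a padding size that is a multiple of four.
pad = padding_to_block(len(part1))
```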
 134
 135 OPTIONS
 136    Integer suffixes and special values
 137        In most places where an integer argument is expected, an optional  suf-
 138        fix  is  supported to easily indicate large integers.  There must be no
 139        space between the integer and the suffix.
 140
 141        KiB    Multiply the integer by 1,024 (2^10).  Ki, k, kB, K, and KB  are
 142               accepted as synonyms for KiB.
 143
 144        MiB    Multiply  the integer by 1,048,576 (2^20).  Mi, m, M, and MB are
 145               accepted as synonyms for MiB.
 146
 147        GiB    Multiply the integer by 1,073,741,824 (2^30).  Gi, g, G, and  GB
 148               are accepted as synonyms for GiB.
 149
 150        The special value max can be used to indicate the maximum integer value
 151        supported by the option.
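
The suffix rules above amount to a small lookup table.  The sketch below is an
illustration written from this description, not xz's own parser; parse_size
and its maximum parameter are hypothetical, and no range checking against the
option's limits is attempted.

```python
import re

_MULTIPLIERS = {
    "": 1,
    "KiB": 2**10, "Ki": 2**10, "k": 2**10, "kB": 2**10, "K": 2**10, "KB": 2**10,
    "MiB": 2**20, "Mi": 2**20, "m": 2**20, "M": 2**20, "MB": 2**20,
    "GiB": 2**30, "Gi": 2**30, "g": 2**30, "G": 2**30, "GB": 2**30,
}


def parse_size(text: str, maximum: int) -> int:
    """Parse an integer with an optional suffix, or the special value max."""
    if text == "max":
        return maximum
    # No space is allowed between the integer and the suffix.
    match = re.fullmatch(r"([0-9]+)([A-Za-z]*)", text)
    if match is None or match.group(2) not in _MULTIPLIERS:
        raise ValueError(f"invalid integer argument: {text!r}")
    return int(match.group(1)) * _MULTIPLIERS[match.group(2)]
```

With this sketch, parse_size("4k", maximum) and parse_size("4KiB", maximum)
give the same value, while "4 KiB" is rejected because of the space.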
 152
 153    Operation mode
 154        If multiple operation mode  options  are  given,  the  last  one  takes
 155        effect.
 156
 157        -z, --compress
 158               Compress.   This is the default operation mode when no operation
 159               mode option is specified and no other operation mode is  implied
 160               from the command name (for example, unxz implies --decompress).
 161
 162        -d, --decompress, --uncompress
 163               Decompress.
 164
 165        -t, --test
 166               Test  the integrity of compressed files.  This option is equiva-
 167               lent to --decompress --stdout except that the decompressed  data
 168               is  discarded  instead  of being written to standard output.  No
 169               files are created or removed.
 170
 171        -l, --list
 172               Print information about compressed files.  No uncompressed  out-
 173               put  is  produced, and no files are created or removed.  In list
 174               mode, the program cannot read the compressed data from  standard
 175               input or from other unseekable sources.
 176
 177               The  default  listing  shows  basic information about files, one
  178               file per line.  To get more detailed information, also use the
 179               --verbose  option.   For  even  more  information, use --verbose
 180               twice, but note that this may be slow, because getting  all  the
 181               extra  information  requires  many  seeks.  The width of verbose
 182               output exceeds 80 characters,  so  piping  the  output  to  e.g.
 183               less -S may be convenient if the terminal isn't wide enough.
 184
 185               The  exact  output  may  vary  between xz versions and different
 186               locales.  For machine-readable output, --robot --list should  be
 187               used.
 188
 189    Operation modifiers
 190        -k, --keep
 191               Don't delete the input files.
 192
 193        -f, --force
 194               This option has several effects:
 195
 196               o  If the target file already exists, delete it before compress-
 197                  ing or decompressing.
 198
 199               o  Compress or decompress even if the input is a  symbolic  link
 200                  to  a  regular  file, has more than one hard link, or has the
 201                  setuid, setgid, or sticky bit set.  The setuid,  setgid,  and
 202                  sticky bits are not copied to the target file.
 203
 204               o  When  used with --decompress --stdout and xz cannot recognize
 205                  the type of the source file, copy the source file  as  is  to
 206                  standard  output.   This allows xzcat --force to be used like
 207                  cat(1) for files that have not been compressed with xz.  Note
  208                  that in the future, xz might support new compressed file formats,
 209                  which may make xz decompress more types of files  instead  of
 210                  copying  them  as is to standard output.  --format=format can
 211                  be used to restrict xz to decompress only a single file  for-
 212                  mat.
 213
 214        -c, --stdout, --to-stdout
 215               Write  the  compressed  or  decompressed data to standard output
 216               instead of a file.  This implies --keep.
 217
 218        --no-sparse
 219               Disable creation of sparse files.  By default, if  decompressing
 220               into  a  regular  file,  xz tries to make the file sparse if the
 221               decompressed data contains long sequences of binary  zeros.   It
 222               also  works  when writing to standard output as long as standard
 223               output is connected to a regular  file  and  certain  additional
 224               conditions  are  met to make it safe.  Creating sparse files may
 225               save disk space and speed up the decompression by  reducing  the
 226               amount of disk I/O.
 227
 228        -S .suf, --suffix=.suf
 229               When  compressing,  use  .suf  as the suffix for the target file
 230               instead of .xz or .lzma.  If not writing to standard output  and
 231               the  source  file already has the suffix .suf, a warning is dis-
 232               played and the file is skipped.
 233
 234               When decompressing, recognize files  with  the  suffix  .suf  in
 235               addition to files with the .xz, .txz, .lzma, or .tlz suffix.  If
 236               the source file has the suffix .suf, the suffix  is  removed  to
 237               get the target filename.
 238
 239               When  compressing  or  decompressing raw streams (--format=raw),
 240               the suffix must always be specified unless writing  to  standard
 241               output, because there is no default suffix for raw streams.
 242
 243        --files[=file]
 244               Read  the  filenames  to  process from file; if file is omitted,
 245               filenames are read from standard input.  Filenames must be  ter-
 246               minated  with  the  newline character.  A dash (-) is taken as a
 247               regular filename; it doesn't mean standard input.  If  filenames
 248               are  given  also  as  command line arguments, they are processed
 249               before the filenames read from file.
 250
 251        --files0[=file]
 252               This is identical to --files[=file] except  that  each  filename
 253               must be terminated with the null character.
 254
 255    Basic file format and compression options
 256        -F format, --format=format
 257               Specify the file format to compress or decompress:
 258
 259               auto   This  is  the default.  When compressing, auto is equiva-
 260                      lent to xz.  When decompressing, the format of the  input
 261                      file  is  automatically  detected.  Note that raw streams
 262                      (created with --format=raw) cannot be auto-detected.
 263
 264               xz     Compress to the .xz file format, or accept only .xz files
 265                      when decompressing.
 266
 267               lzma, alone
 268                      Compress  to the legacy .lzma file format, or accept only
 269                      .lzma files when  decompressing.   The  alternative  name
 270                      alone  is  provided for backwards compatibility with LZMA
 271                      Utils.
 272
 273               raw    Compress or uncompress a raw stream (no  headers).   This
 274                      is meant for advanced users only.  To decode raw streams,
  275                      you need to use --format=raw and explicitly specify the
  276                      filter chain, which normally would have been stored in the
 277                      container headers.
 278
 279        -C check, --check=check
 280               Specify the type of the integrity check.  The  check  is  calcu-
 281               lated  from  the  uncompressed  data and stored in the .xz file.
 282               This option has an effect only when  compressing  into  the  .xz
 283               format;  the .lzma format doesn't support integrity checks.  The
 284               integrity check (if any) is verified when the .xz file is decom-
 285               pressed.
 286
 287               Supported check types:
 288
 289               none   Don't  calculate an integrity check at all.  This is usu-
 290                      ally a bad idea.  This can be useful  when  integrity  of
 291                      the data is verified by other means anyway.
 292
 293               crc32  Calculate  CRC32  using  the  polynomial  from IEEE-802.3
 294                      (Ethernet).
 295
 296               crc64  Calculate CRC64 using the polynomial from ECMA-182.  This
 297                      is the default, since it is slightly better than CRC32 at
 298                      detecting damaged files and the speed difference is  neg-
 299                      ligible.
 300
 301               sha256 Calculate  SHA-256.   This  is somewhat slower than CRC32
 302                      and CRC64.
 303
 304               Integrity of the .xz headers is always verified with CRC32.   It
 305               is not possible to change or disable it.
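
The check type stored in an .xz file can be inspected with Python's standard
lzma module, which wraps liblzma and is used here in place of the xz command
(an assumption of this sketch).  The helper check_id_of is hypothetical.

```python
import lzma

payload = b"integrity check demo"


def check_id_of(blob: bytes) -> int:
    """Return the ID of the integrity check used by an .xz stream."""
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    decoder.decompress(blob)  # the check is also verified at this point
    return decoder.check


default_blob = lzma.compress(payload)  # no check requested
crc32_blob = lzma.compress(payload, check=lzma.CHECK_CRC32)
none_blob = lzma.compress(payload, check=lzma.CHECK_NONE)
```

Python's module uses CRC64 when no check is requested, matching the default
described above.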
 306
 307        -0 ... -9
 308               Select  a compression preset level.  The default is -6.  If mul-
 309               tiple preset levels are specified, the last  one  takes  effect.
 310               If  a  custom filter chain was already specified, setting a com-
 311               pression preset level clears the custom filter chain.
 312
 313               The differences between the presets are  more  significant  than
 314               with  gzip(1)  and  bzip2(1).  The selected compression settings
 315               determine the memory  requirements  of  the  decompressor,  thus
  316               using too high a preset level might make it painful to decompress
  317               the file on an old system with little RAM.  Specifically,
 318               it's  not  a  good idea to blindly use -9 for everything like it
 319               often is with gzip(1) and bzip2(1).
 320
 321               -0 ... -3
 322                      These are somewhat fast presets.  -0 is sometimes  faster
 323                      than  gzip  -9 while compressing much better.  The higher
 324                      ones often have speed comparable to bzip2(1) with  compa-
 325                      rable  or  better compression ratio, although the results
 326                      depend a lot on the type of data being compressed.
 327
 328               -4 ... -6
 329                      Good to very good compression while keeping  decompressor
 330                      memory  usage reasonable even for old systems.  -6 is the
 331                      default, which is usually a good  choice  e.g.  for  dis-
 332                      tributing  files  that  need to be decompressible even on
 333                      systems with only 16 MiB RAM.  (-5e or -6e may  be  worth
 334                      considering too.  See --extreme.)
 335
 336               -7 ... -9
 337                      These  are  like -6 but with higher compressor and decom-
 338                      pressor memory requirements.  These are useful only  when
 339                      compressing  files bigger than 8 MiB, 16 MiB, and 32 MiB,
 340                      respectively.
 341
 342               On the same hardware, the decompression speed is approximately a
 343               constant  number  of  bytes  of  compressed data per second.  In
 344               other words, the better the compression, the faster  the  decom-
 345               pression  will  usually  be.  This also means that the amount of
 346               uncompressed output produced per second can vary a lot.
 347
 348               The following table summarises the features of the presets:
 349
 350                      Preset   DictSize   CompCPU   CompMem   DecMem
 351                        -0     256 KiB       0        3 MiB    1 MiB
 352                        -1       1 MiB       1        9 MiB    2 MiB
 353                        -2       2 MiB       2       17 MiB    3 MiB
 354                        -3       4 MiB       3       32 MiB    5 MiB
 355                        -4       4 MiB       4       48 MiB    5 MiB
 356                        -5       8 MiB       5       94 MiB    9 MiB
 357                        -6       8 MiB       6       94 MiB    9 MiB
 358                        -7      16 MiB       6      186 MiB   17 MiB
 359                        -8      32 MiB       6      370 MiB   33 MiB
 360                        -9      64 MiB       6      674 MiB   65 MiB
 361
 362               Column descriptions:
 363
  364               o  DictSize is the LZMA2 dictionary size.  It is a waste of memory
 365                  to  use a dictionary bigger than the size of the uncompressed
 366                  file.  This is why it is good to avoid using the  presets  -7
 367                  ...  -9 when there's no real need for them.  At -6 and lower,
 368                  the amount of memory wasted is usually low enough to not mat-
 369                  ter.
 370
 371               o  CompCPU  is a simplified representation of the LZMA2 settings
 372                  that affect compression speed.  The dictionary  size  affects
 373                  speed too, so while CompCPU is the same for levels -6 ... -9,
 374                  higher levels still tend to be a little slower.  To get  even
 375                  slower and thus possibly better compression, see --extreme.
 376
 377               o  CompMem  contains  the  compressor memory requirements in the
 378                  single-threaded mode.  It may vary slightly between  xz  ver-
 379                  sions.   Memory  requirements  of  some  of the future multi-
 380                  threaded modes may be dramatically higher than  that  of  the
 381                  single-threaded mode.
 382
 383               o  DecMem  contains  the decompressor memory requirements.  That
 384                  is, the compression settings determine  the  memory  require-
 385                  ments  of  the  decompressor.   The exact decompressor memory
  386                  usage is slightly more than the LZMA2 dictionary size, but the
 387                  values  in  the  table  have been rounded up to the next full
 388                  MiB.
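
The relationships in the preset table can be written down as a short sketch.
The dictionary sizes are copied from the DictSize column; the two helper
functions are hypothetical and only restate the guidance given above (DecMem
is slightly more than the dictionary size, rounded up to a full MiB, and -7
... -9 are useful only for files bigger than 8, 16, and 32 MiB).

```python
MIB = 1024 * 1024

# DictSize column of the preset table, in bytes.
DICT_SIZE = {0: 256 * 1024, 1: 1 * MIB, 2: 2 * MIB, 3: 4 * MIB, 4: 4 * MIB,
             5: 8 * MIB, 6: 8 * MIB, 7: 16 * MIB, 8: 32 * MIB, 9: 64 * MIB}


def decoder_mem_mib(preset: int) -> int:
    """Approximate DecMem: just over the dictionary size, in full MiB."""
    return DICT_SIZE[preset] // MIB + 1


def highest_useful_preset(file_size: int) -> int:
    """Return 6 unless the file is big enough for -7, -8, or -9 to help."""
    best = 6
    for preset in (7, 8, 9):
        # A preset above -6 helps only when the file is bigger than the
        # dictionary of the preset just below it (8, 16, and 32 MiB).
        if file_size > DICT_SIZE[preset - 1]:
            best = preset
    return best
```

This is a rule of thumb derived from the table, not a recommendation to
always pick the highest preset that qualifies.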
 389
 390        -e, --extreme
 391               Use a slower variant of the selected  compression  preset  level
 392               (-0  ...  -9)  to  hopefully get a little bit better compression
 393               ratio, but with bad luck this can also make  it  worse.   Decom-
 394               pressor  memory  usage  is  not  affected, but compressor memory
 395               usage increases a little at preset levels -0 ... -3.
 396
 397               Since there are two presets  with  dictionary  sizes  4 MiB  and
 398               8 MiB,  the  presets  -3e  and  -5e use slightly faster settings
 399               (lower CompCPU) than -4e and -6e, respectively.  That way no two
 400               presets are identical.
 401
 402                      Preset   DictSize   CompCPU   CompMem   DecMem
 403                       -0e     256 KiB       8        4 MiB    1 MiB
 404                       -1e       1 MiB       8       13 MiB    2 MiB
 405                       -2e       2 MiB       8       25 MiB    3 MiB
 406                       -3e       4 MiB       7       48 MiB    5 MiB
 407                       -4e       4 MiB       8       48 MiB    5 MiB
 408                       -5e       8 MiB       7       94 MiB    9 MiB
 409                       -6e       8 MiB       8       94 MiB    9 MiB
 410                       -7e      16 MiB       8      186 MiB   17 MiB
 411                       -8e      32 MiB       8      370 MiB   33 MiB
 412                       -9e      64 MiB       8      674 MiB   65 MiB
 413
 414               For  example,  there  are a total of four presets that use 8 MiB
 415               dictionary, whose order from the fastest to the slowest  is  -5,
 416               -6, -5e, and -6e.
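
In liblzma, the extreme variant is a flag combined with the preset number.
The sketch below shows this through Python's standard lzma module, which is
used here instead of the xz command (an assumption of this sketch).  No claim
is made about which output is smaller, since that depends on the data.

```python
import lzma

sample = b"The quick brown fox jumps over the lazy dog.\n" * 2000

# Roughly what -6 and -6e select on the command line.
normal = lzma.compress(sample, preset=6)
extreme = lzma.compress(sample, preset=6 | lzma.PRESET_EXTREME)

# Both variants decode with the same decompressor settings.
restored_normal = lzma.decompress(normal)
restored_extreme = lzma.decompress(extreme)
```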
 417
 418        --fast
 419        --best These  are  somewhat  misleading  aliases for -0 and -9, respec-
 420               tively.  These are provided  only  for  backwards  compatibility
 421               with LZMA Utils.  Avoid using these options.
 422
 423        --memlimit-compress=limit
 424               Set  a  memory  usage  limit for compression.  If this option is
 425               specified multiple times, the last one takes effect.
 426
 427               If the compression settings exceed the limit, xz will adjust the
 428               settings  downwards  so that the limit is no longer exceeded and
 429               display a notice  that  automatic  adjustment  was  done.   Such
 430               adjustments  are  not made when compressing with --format=raw or
 431               if --no-adjust has been specified.  In those cases, an error  is
 432               displayed and xz will exit with exit status 1.
 433
 434               The limit can be specified in multiple ways:
 435
 436               o  The  limit can be an absolute value in bytes.  Using an inte-
 437                  ger suffix like MiB can be useful.  Example:  --memlimit-com-
 438                  press=80MiB
 439
 440               o  The  limit can be specified as a percentage of total physical
 441                  memory (RAM).  This can be useful especially when setting the
 442                  XZ_DEFAULTS  environment  variable  in a shell initialization
 443                  script that is shared between different computers.  That  way
 444                  the  limit  is automatically bigger on systems with more mem-
 445                  ory.  Example: --memlimit-compress=70%
 446
 447               o  The limit can be reset back to its default value  by  setting
 448                  it  to  0.  This is currently equivalent to setting the limit
 449                  to max (no memory usage limit).  Once multithreading  support
 450                  has been implemented, there may be a difference between 0 and
 451                  max for the multithreaded case, so it is recommended to use 0
 452                  instead of max until the details have been decided.
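
The three ways of specifying the limit can be sketched as follows.  The
function resolve_memlimit is hypothetical, accepts only the KiB, MiB, and GiB
suffixes for brevity, and takes the amount of physical memory as a parameter
instead of querying the system.

```python
from typing import Optional

_UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30}


def resolve_memlimit(spec: str, total_ram: int) -> Optional[int]:
    """Return the limit in bytes, or None when no limit applies."""
    if spec in ("0", "max"):
        # Both currently mean "no memory usage limit".
        return None
    if spec.endswith("%"):
        # Percentage of total physical memory.
        return total_ram * int(spec[:-1]) // 100
    for unit, factor in _UNITS.items():
        if spec.endswith(unit):
            return int(spec[: -len(unit)]) * factor
    return int(spec)  # plain number of bytes
```

For example, on a machine with 2 GiB of RAM, resolve_memlimit("70%", 2 * 2**30)
gives a limit of about 1.4 GiB.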
 453
 454               See also the section Memory usage.
 455
 456        --memlimit-decompress=limit
 457               Set  a  memory usage limit for decompression.  This also affects
 458               the --list mode.  If  the  operation  is  not  possible  without
 459               exceeding  the limit, xz will display an error and decompressing
 460               the file will fail.  See --memlimit-compress=limit for  possible
 461               ways to specify the limit.
 462
 463        -M limit, --memlimit=limit, --memory=limit
 464               This   is  equivalent  to  specifying  --memlimit-compress=limit
 465               --memlimit-decompress=limit.
 466
 467        --no-adjust
 468               Display an error and exit if the compression settings exceed the
 469               memory usage limit.  The default is to adjust the settings down-
 470               wards so that the memory usage limit is not exceeded.  Automatic
 471               adjusting  is  always disabled when creating raw streams (--for-
 472               mat=raw).
 473
 474        -T threads, --threads=threads
 475               Specify the number of worker threads to use.  The actual  number
 476               of  threads can be less than threads if using more threads would
 477               exceed the memory usage limit.
 478
 479               Multithreaded compression and decompression are not  implemented
 480               yet, so this option has no effect for now.
 481
 482               As  of  writing  (2010-09-27), it hasn't been decided if threads
 483               will be used by default on multicore systems  once  support  for
 484               threading has been implemented.  Comments are welcome.  The com-
 485               plicating factor is that using many threads  will  increase  the
  486               memory usage dramatically.  Note that if multithreading becomes
  487               the default, it will probably be done so that single-threaded
  488               and multithreaded modes produce the same output, so compression
  489               ratio won't be significantly affected if threading is
  490               enabled by default.
 491
 492    Custom compressor filter chains
 493        A  custom  filter  chain  allows specifying the compression settings in
  494        detail instead of relying on the settings associated with the preset
  495        levels.  When a custom filter chain is specified, the compression preset
 496        level options (-0 ... -9 and --extreme) are silently ignored.
 497
 498        A filter chain is comparable to piping on the command line.  When  com-
 499        pressing, the uncompressed input goes to the first filter, whose output
 500        goes to the next filter (if any).  The output of the last  filter  gets
 501        written  to  the compressed file.  The maximum number of filters in the
 502        chain is four, but typically a filter chain has only one  or  two  fil-
 503        ters.
 504
 505        Many filters have limitations on where they can be in the filter chain:
 506        some filters can work only as the last filter in the chain,  some  only
 507        as  a  non-last  filter,  and  some  work in any position in the chain.
 508        Depending on the filter, this limitation is either inherent to the fil-
 509        ter design or exists to prevent security issues.
 510
 511        A  custom filter chain is specified by using one or more filter options
 512        in the order they are wanted in the filter chain.  That is,  the  order
 513        of  filter  options  is significant!  When decoding raw streams (--for-
 514        mat=raw), the filter chain is specified in the same  order  as  it  was
 515        specified when compressing.
 516
 517        Filters  take filter-specific options as a comma-separated list.  Extra
 518        commas in options are ignored.  Every option has a  default  value,  so
 519        you need to specify only those you want to change.
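
A one-filter chain and a raw stream can be tried out with Python's standard
lzma module, which exposes liblzma's filter chains as a list of dictionaries.
Using that module instead of the xz command is an assumption of this sketch.
Only the dictionary size is overridden; the remaining LZMA2 options keep the
defaults taken from the preset.

```python
import lzma

# LZMA2 as the only (and therefore last) filter, starting from preset 6
# with a 1 MiB dictionary.
chain = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": 1 << 20}]

text = b"filter chain demo " * 500

# A raw stream has no headers, so the same chain must be given again
# when decoding.
raw = lzma.compress(text, format=lzma.FORMAT_RAW, filters=chain)
restored = lzma.decompress(raw, format=lzma.FORMAT_RAW, filters=chain)

# For comparison, the .xz container stores the chain in its headers.
boxed = lzma.compress(text, format=lzma.FORMAT_XZ, filters=chain)
unboxed = lzma.decompress(boxed)
```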
 520
 521        --lzma1[=options]
 522        --lzma2[=options]
 523               Add  LZMA1  or  LZMA2 filter to the filter chain.  These filters
 524               can be used only as the last filter in the chain.
 525
 526               LZMA1 is a legacy filter, which is supported almost  solely  due
 527               to  the  legacy  .lzma  file  format, which supports only LZMA1.
 528               LZMA2 is an updated version  of  LZMA1  to  fix  some  practical
 529               issues  of LZMA1.  The .xz format uses LZMA2 and doesn't support
 530               LZMA1 at all.  Compression speed and ratios of LZMA1  and  LZMA2
 531               are practically the same.
 532
 533               LZMA1 and LZMA2 share the same set of options:
 534
 535               preset=preset
  536                      Reset all LZMA1 or LZMA2 options to preset.  A preset
  537                      consists of an integer, which may be followed by
  538                      single-letter preset modifiers.  The integer can be from 0 to 9,
 539                      matching the command line options -0 ...  -9.   The  only
 540                      supported   modifier   is   currently  e,  which  matches
 541                      --extreme.  The default  preset  is  6,  from  which  the
 542                      default values for the rest of the LZMA1 or LZMA2 options
 543                      are taken.
 544
 545               dict=size
 546                      Dictionary (history buffer) size indicates how many bytes
  547                      of the recently processed uncompressed data are kept in
 548                      memory.  The  algorithm  tries  to  find  repeating  byte
 549                      sequences (matches) in the uncompressed data, and replace
 550                      them with references to the data currently in the dictio-
 551                      nary.   The  bigger  the  dictionary,  the  higher is the
 552                      chance to find a match.  Thus, increasing dictionary size
 553                      usually improves compression ratio, but a dictionary
 554                      bigger than the uncompressed file is a waste of memory.
 555
 556                      Typical dictionary size is from 64 KiB  to  64 MiB.   The
 557                      minimum  is  4 KiB.   The maximum for compression is cur-
 558                      rently 1.5 GiB (1536 MiB).  The decompressor already sup-
 559                      ports  dictionaries up to one byte less than 4 GiB, which
 560                      is the maximum for the LZMA1 and LZMA2 stream formats.
 561
 562                      Dictionary size and match finder (mf) together  determine
 563                      the memory usage of the LZMA1 or LZMA2 encoder.
 564                      Decompression requires a dictionary at least as big as
 565                      the one used when compressing, so the memory usage of the
 566                      decoder is determined by the dictionary size used when
 567                      compressing.  The .xz headers store the dictionary
 568                      size either as 2^n or 2^n + 2^(n-1), so these  sizes  are
 569                      somewhat preferred for compression.  Other sizes will get
 570                      rounded up when stored in the .xz headers.
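
                          The rounding described above can be sketched in sh(1).
                          This is only an illustration of the 2^n and
                          2^n + 2^(n-1) rule, not code taken from xz; the function
                          name round_dict_size is made up for this example, and
                          clamping small values to the 4 KiB minimum is an
                          assumption.

```shell
# Round a dictionary size (in bytes) up to the next value of the
# form 2^n or 2^n + 2^(n-1), starting from the 4 KiB minimum.
# Hypothetical helper for illustration; not part of xz.
round_dict_size() {
    size=$1
    p=4096                      # 2^12, the documented minimum
    while :; do
        if [ "$size" -le "$p" ]; then
            echo "$p"
            return 0
        fi
        mid=$((p + p / 2))      # 2^n + 2^(n-1)
        if [ "$size" -le "$mid" ]; then
            echo "$mid"
            return 0
        fi
        p=$((p * 2))
    done
}
```

                          For example, round_dict_size 5000000 prints 6291456
                          (6 MiB), while a size that already has one of the two
                          forms, such as 8388608 (8 MiB), is printed unchanged.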
 571
 572               lc=lc  Specify the number of literal context bits.  The  minimum
 573                      is  0  and  the maximum is 4; the default is 3.  In addi-
 574                      tion, the sum of lc and lp must not exceed 4.
 575
 576                      All bytes that cannot be encoded as matches  are  encoded
 577                      as  literals.   That  is, literals are simply 8-bit bytes
 578                      that are encoded one at a time.
 579
 580                      The literal coding makes an assumption that  the  highest
 581                      lc  bits of the previous uncompressed byte correlate with
 582                      the next byte.  E.g. in typical English text,  an  upper-
 583                      case letter is often followed by a lower-case letter, and
 584                      a lower-case letter is usually followed by another lower-
 585                      case  letter.  In the US-ASCII character set, the highest
 586                      three bits are 010 for upper-case  letters  and  011  for
 587                      lower-case  letters.   When lc is at least 3, the literal
 588                      coding can take advantage of this property in the  uncom-
 589                      pressed data.
 590
 591                      The default value (3) is usually good.  If you want maxi-
 592                      mum compression, test lc=4.  Sometimes it helps a little,
 593                      and sometimes it makes compression worse.  If it makes it
 594                      worse, test e.g. lc=2 too.
 595
 596               lp=lp  Specify the number of literal position bits.  The minimum
 597                      is 0 and the maximum is 4; the default is 0.
 598
 599                      Lp  affects  what  kind  of alignment in the uncompressed
 600                      data is assumed when encoding literals.  See pb below for
 601                      more information about alignment.
 602
 603               pb=pb  Specify  the  number  of position bits.  The minimum is 0
 604                      and the maximum is 4; the default is 2.
 605
 606                      Pb affects what kind of  alignment  in  the  uncompressed
 607                      data  is assumed in general.  The default means four-byte
 608                      alignment (2^pb=2^2=4), which is often a good choice when
 609                      there's no better guess.
 610
 611                      When the alignment is known, setting pb accordingly may
 612                      reduce the file size a little.  E.g. with text files hav-
 613                      ing  one-byte  alignment  (US-ASCII,  ISO-8859-*, UTF-8),
 614                      setting  pb=0  can  improve  compression  slightly.   For
 615                      UTF-16  text, pb=1 is a good choice.  If the alignment is
 616                      an odd number like  3  bytes,  pb=0  might  be  the  best
 617                      choice.
 618
 619                      Even though the assumed alignment can be adjusted with pb
 620                      and lp, LZMA1 and  LZMA2  still  slightly  favor  16-byte
 621                      alignment.   It  might  be worth taking into account when
 622                      designing file formats that are likely to be  often  com-
 623                      pressed with LZMA1 or LZMA2.
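
                          The limits on lc, lp, and pb given above can be checked
                          before building an option string.  The following sh(1)
                          sketch assumes a hypothetical helper named
                          check_lzma_bits; it only restates the documented ranges
                          (0-4 each, lc + lp at most 4) and does not call xz.

```shell
# Validate lc, lp, and pb against the documented limits and print
# them as a comma-separated option fragment.
# Hypothetical helper for illustration; not part of xz.
check_lzma_bits() {
    lc=$1
    lp=$2
    pb=$3
    for v in "$lc" "$lp" "$pb"; do
        case $v in
            [0-4]) ;;
            *)
                echo "lc, lp, and pb must each be 0-4" >&2
                return 1
                ;;
        esac
    done
    if [ $((lc + lp)) -gt 4 ]; then
        echo "the sum of lc and lp must not exceed 4" >&2
        return 1
    fi
    echo "lc=$lc,lp=$lp,pb=$pb"
}
```

                          The printed fragment could then be used as, for example,
                          xz --lzma2=preset=6,$(check_lzma_bits 3 0 0) file.txt
                          for text with one-byte alignment.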
 624
 625               mf=mf  Match  finder has a major effect on encoder speed, memory
 626                      usage, and compression ratio.  Usually Hash  Chain  match
 627                      finders  are  faster than Binary Tree match finders.  The
 628                      default depends on the preset: 0 uses hc3, 1-3  use  hc4,
 629                      and the rest use bt4.
 630
 631                      The  following  match  finders are supported.  The memory
 632                      usage formulas below are rough approximations, which  are
 633                      closest to the reality when dict is a power of two.
 634
 635                      hc3    Hash Chain with 2- and 3-byte hashing
 636                             Minimum value for nice: 3
 637                             Memory usage:
 638                             dict * 7.5 (if dict <= 16 MiB);
 639                             dict * 5.5 + 64 MiB (if dict > 16 MiB)
 640
 641                      hc4    Hash Chain with 2-, 3-, and 4-byte hashing
 642                             Minimum value for nice: 4
 643                             Memory usage:
 644                             dict * 7.5 (if dict <= 32 MiB);
 645                             dict * 6.5 (if dict > 32 MiB)
 646
 647                      bt2    Binary Tree with 2-byte hashing
 648                             Minimum value for nice: 2
 649                             Memory usage: dict * 9.5
 650
 651                      bt3    Binary Tree with 2- and 3-byte hashing
 652                             Minimum value for nice: 3
 653                             Memory usage:
 654                             dict * 11.5 (if dict <= 16 MiB);
 655                             dict * 9.5 + 64 MiB (if dict > 16 MiB)
 656
 657                      bt4    Binary Tree with 2-, 3-, and 4-byte hashing
 658                             Minimum value for nice: 4
 659                             Memory usage:
 660                             dict * 11.5 (if dict <= 32 MiB);
 661                             dict * 10.5 (if dict > 32 MiB)
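
                          The formulas above can be turned into a rough
                          calculator.  This sh(1) sketch assumes a made-up
                          function name, mf_mem_mib, takes dict as a whole number
                          of MiB, and rounds the result down to whole MiB; like
                          the formulas themselves, the result is only an
                          approximation.

```shell
# Estimate encoder memory usage in MiB from the match finder name
# and the dictionary size in MiB, using the rough formulas listed
# in this section.  Hypothetical helper; not part of xz.
mf_mem_mib() {
    mf=$1
    dict=$2
    # t holds tenths of a MiB so that factors like 7.5 stay integers.
    case $mf in
        hc3)
            if [ "$dict" -le 16 ]; then t=$((dict * 75))
            else t=$((dict * 55 + 640)); fi ;;
        hc4)
            if [ "$dict" -le 32 ]; then t=$((dict * 75))
            else t=$((dict * 65)); fi ;;
        bt2)
            t=$((dict * 95)) ;;
        bt3)
            if [ "$dict" -le 16 ]; then t=$((dict * 115))
            else t=$((dict * 95 + 640)); fi ;;
        bt4)
            if [ "$dict" -le 32 ]; then t=$((dict * 115))
            else t=$((dict * 105)); fi ;;
        *)
            echo "unknown match finder: $mf" >&2
            return 1 ;;
    esac
    echo $((t / 10))
}
```

                          For example, mf_mem_mib bt4 8 prints 92, which is close
                          to the compressor memory usage listed for the default
                          preset in the section LZMA UTILS COMPATIBILITY.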
 662
 663               mode=mode
 664                      Compression mode specifies the method to analyze the data
 665                      produced by the match finder.  Supported modes  are  fast
 666                      and normal.  The default is fast for presets 0-3 and nor-
 667                      mal for presets 4-9.
 668
 669                      Usually fast is used with Hash Chain  match  finders  and
 670                      normal with Binary Tree match finders.  This is also what
 671                      the presets do.
 672
 673               nice=nice
 674                      Specify what is considered to be  a  nice  length  for  a
 675                      match.  Once a match of at least nice bytes is found, the
 676                      algorithm stops looking for possibly better matches.
 677
 678                      Nice can be 2-273 bytes.  Higher values tend to give bet-
 679                      ter  compression  ratio  at  the  expense  of speed.  The
 680                      default depends on the preset.
 681
 682               depth=depth
 683                      Specify the maximum search depth  in  the  match  finder.
 684                      The  default  is  the special value of 0, which makes the
 685                      compressor determine a reasonable depth from mf and nice.
 686
 687                      Reasonable depth for Hash Chains is 4-100 and 16-1000 for
 688                      Binary  Trees.  Using very high values for depth can make
 689                      the encoder extremely slow with some files.   Avoid  set-
 690                      ting  the  depth  over  1000  unless  you are prepared to
 691                      interrupt the compression in case it is  taking  far  too
 692                      long.
 693
 694               When  decoding  raw streams (--format=raw), LZMA2 needs only the
 695               dictionary size.  LZMA1 also needs lc, lp, and pb.
 696
 697        --x86[=options]
 698        --powerpc[=options]
 699        --ia64[=options]
 700        --arm[=options]
 701        --armthumb[=options]
 702        --sparc[=options]
 703               Add a branch/call/jump (BCJ) filter to the filter chain.   These
 704               filters  can  be  used  only  as a non-last filter in the filter
 705               chain.
 706
 707               A BCJ filter converts relative addresses in the machine code  to
 708               their  absolute  counterparts.   This doesn't change the size of
 709               the data, but it increases redundancy, which can help LZMA2 to
 710               produce a 0-15 % smaller .xz file.  The BCJ filters are always
 711               reversible, so using a BCJ filter for the wrong type of data
 712               doesn't cause any data loss, although it may make the compression
 713               ratio slightly worse.
 714
 715               It is fine to apply a BCJ filter on a whole executable;  there's
 716               no  need to apply it only on the executable section.  Applying a
 717               BCJ filter on an archive that contains both executable and  non-
 718               executable  files may or may not give good results, so it gener-
 719               ally isn't good to blindly apply a BCJ filter  when  compressing
 720               binary packages for distribution.
 721
 722               These BCJ filters are very fast and use an insignificant amount
 723               of memory.  If a BCJ filter improves compression ratio of a file,
 724               it  can  improve  decompression speed at the same time.  This is
 725               because, on the same hardware, the decompression speed of  LZMA2
 726               is  roughly  a fixed number of bytes of compressed data per sec-
 727               ond.
 728
 729               These BCJ filters have known problems related to the compression
 730               ratio:
 731
 732               o  Some  types  of files containing executable code (e.g. object
 733                  files, static libraries, and Linux kernel modules)  have  the
 734                  addresses  in  the  instructions  filled  with filler values.
 735                  These BCJ filters will still do the address conversion, which
 736                  will make the compression worse with these files.
 737
 738               o  Applying a BCJ filter on an archive containing multiple simi-
 739                  lar executables can make the compression ratio worse than not
 740                  using  a  BCJ filter.  This is because the BCJ filter doesn't
 741                  detect the boundaries of the executable  files,  and  doesn't
 742                  reset the address conversion counter for each executable.
 743
 744               Both  of the above problems will be fixed in the future in a new
 745               filter.  The old BCJ filters will still be  useful  in  embedded
 746               systems,  because  the  decoder of the new filter will be bigger
 747               and use more memory.
 748
 749               Different instruction sets have different alignment:
 750
 751                      Filter      Alignment   Notes
 752                      x86             1       32-bit or 64-bit x86
 753                      PowerPC         4       Big endian only
 754                      ARM             4       Little endian only
 755                      ARM-Thumb       2       Little endian only
 756                      IA-64          16       Big or little endian
 757                      SPARC           4       Big or little endian
 758
 759               Since the BCJ-filtered data is usually  compressed  with  LZMA2,
 760               the  compression  ratio  may  be  improved slightly if the LZMA2
 761               options are set to match the alignment of the selected BCJ  fil-
 762               ter.   For example, with the IA-64 filter, it's good to set pb=4
 763               with LZMA2 (2^4=16).  The x86 filter is an exception; it's  usu-
 764               ally  good  to stick to LZMA2's default four-byte alignment when
 765               compressing x86 executables.
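
                   The alignment advice above can be written down as a small
                   lookup.  This sh(1) sketch uses a hypothetical function name,
                   bcj_filter_args; it maps each filter to pb such that 2^pb
                   equals the alignment in the table, keeps the default pb=2 for
                   x86 as suggested above, and only prints options without
                   running xz.

```shell
# Print a BCJ filter option followed by an LZMA2 option whose pb
# matches the alignment of that filter (2^pb = alignment).
# Hypothetical helper for illustration; not part of xz.
bcj_filter_args() {
    case $1 in
        x86)               pb=2 ;;  # exception: keep LZMA2's default
        powerpc|arm|sparc) pb=2 ;;  # 4-byte alignment
        armthumb)          pb=1 ;;  # 2-byte alignment
        ia64)              pb=4 ;;  # 16-byte alignment
        *)
            echo "unknown BCJ filter: $1" >&2
            return 1 ;;
    esac
    echo "--$1 --lzma2=pb=$pb"
}
```

                   The output can be passed to xz unquoted, for example
                   xz $(bcj_filter_args armthumb) firmware.bin, where
                   firmware.bin is a placeholder file name.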
 766
 767               All BCJ filters support the same options:
 768
 769               start=offset
 770                      Specify the start offset that  is  used  when  converting
 771                      between relative and absolute addresses.  The offset must
 772                      be a multiple of the alignment of the filter (see the ta-
 773                      ble  above).   The  default  is  zero.   In practice, the
 774                      default is good; specifying a  custom  offset  is  almost
 775                      never useful.
 776
 777        --delta[=options]
 778               Add  the Delta filter to the filter chain.  The Delta filter can
 779               be used only as a non-last filter in the filter chain.
 780
 781               Currently only simple byte-wise delta calculation is  supported.
 782               It  can  be  useful  when  compressing  e.g. uncompressed bitmap
 783               images or uncompressed  PCM  audio.   However,  special  purpose
 784               algorithms  may  give  significantly better results than Delta +
 785               LZMA2.  This is true especially  with  audio,  which  compresses
 786               faster and better e.g. with flac(1).
 787
 788               Supported options:
 789
 790               dist=distance
 791                      Specify  the  distance of the delta calculation in bytes.
 792                      distance must be 1-256.  The default is 1.
 793
 794                      For example, with dist=2 and eight-byte input A1 B1 A2 B3
 795                      A3 B5 A4 B7, the output will be A1 B1 01 02 01 02 01 02.
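
                          The arithmetic of this example can be reproduced with a
                          short sh(1) sketch.  It is an illustration of byte-wise
                          delta encoding on hexadecimal byte values, not the
                          liblzma implementation; the function name delta_encode
                          is made up, and treating bytes before the start of the
                          input as zero is an assumption that matches the example
                          above.

```shell
# Byte-wise delta encoding: each output byte is the input byte
# minus the byte "dist" positions earlier, modulo 256.
# Usage: delta_encode DIST HEXBYTE...
# Hypothetical helper for illustration; not part of xz.
delta_encode() {
    dist=$1
    shift
    i=0
    out=""
    for byte in "$@"; do
        cur=$((0x$byte))
        if [ "$i" -ge "$dist" ]; then
            # Fetch the byte seen "dist" positions earlier.
            eval "prev=\$hist_$((i - dist))"
        else
            prev=0
        fi
        eval "hist_$i=$cur"
        out="$out$(printf '%02X' $(( (cur - prev) & 255 ))) "
        i=$((i + 1))
    done
    printf '%s\n' "${out% }"
}
```

                          Running delta_encode 2 A1 B1 A2 B3 A3 B5 A4 B7 prints
                          A1 B1 01 02 01 02 01 02.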
 796
 797    Other options
 798        -q, --quiet
 799               Suppress  warnings  and notices.  Specify this twice to suppress
 800               errors too.  This option has no effect on the exit status.  That
 801               is,  even  if a warning was suppressed, the exit status to indi-
 802               cate a warning is still used.
 803
 804        -v, --verbose
 805               Be verbose.  If standard error is connected to  a  terminal,  xz
 806               will  display  a progress indicator.  Specifying --verbose twice
 807               will give even more verbose output.
 808
 809               The progress indicator shows the following information:
 810
 811               o  Completion percentage is shown if the size of the input  file
 812                  is known.  That is, the percentage cannot be shown in pipes.
 813
 814               o  Amount  of compressed data produced (compressing) or consumed
 815                  (decompressing).
 816
 817               o  Amount of uncompressed data consumed  (compressing)  or  pro-
 818                  duced (decompressing).
 819
 820               o  Compression ratio, which is calculated by dividing the amount
 821                  of compressed data processed so far by the amount  of  uncom-
 822                  pressed data processed so far.
 823
 824               o  Compression  or decompression speed.  This is measured as the
 825                  amount of uncompressed data consumed  (compression)  or  pro-
 826                  duced  (decompression)  per  second.  It is shown after a few
 827                  seconds have passed since xz started processing the file.
 828
 829               o  Elapsed time in the format M:SS or H:MM:SS.
 830
 831               o  Estimated remaining time is shown only when the size  of  the
 832                  input  file  is  known  and  a couple of seconds have already
 833                  passed since xz started processing the  file.   The  time  is
 834                  shown  in  a  less precise format which never has any colons,
 835                  e.g. 2 min 30 s.
 836
 837               When standard error is not a terminal, --verbose  will  make  xz
 838               print the filename, compressed size, uncompressed size, compres-
 839               sion ratio, and possibly also the speed and elapsed  time  on  a
 840               single line to standard error after compressing or decompressing
 841               the file.  The speed and elapsed time are included only when the
 842               operation  took at least a few seconds.  If the operation didn't
 843               finish, e.g. due to user interruption, the completion percentage
 844               is also printed if the size of the input file is known.
 845
 846        -Q, --no-warn
 847               Don't set the exit status to 2 even if a condition worth a warn-
 848               ing was detected.  This  option  doesn't  affect  the  verbosity
 849               level,  thus  both  --quiet and --no-warn have to be used to not
 850               display warnings and to not alter the exit status.
 851
 852        --robot
 853               Print messages in a machine-parsable format.  This  is  intended
 854               to  ease  writing  frontends  that  want  to  use  xz instead of
 855               liblzma, which may be the case with various scripts.  The output
 856               with  this  option  enabled  is  meant  to  be  stable across xz
 857               releases.  See the section ROBOT MODE for details.
 858
 859        --info-memory
 860               Display, in human-readable  format,  how  much  physical  memory
 861               (RAM)  xz  thinks the system has and the memory usage limits for
 862               compression and decompression, and exit successfully.
 863
 864        -h, --help
 865               Display  a  help  message  describing  the  most  commonly  used
 866               options, and exit successfully.
 867
 868        -H, --long-help
 869               Display  a  help message describing all features of xz, and exit
 870               successfully.
 871
 872        -V, --version
 873               Display the version number of xz and liblzma in  human  readable
 874               format.   To get machine-parsable output, specify --robot before
 875               --version.
 876
 877 ROBOT MODE
 878        The robot mode is activated with the --robot option.  It makes the out-
 879        put of xz easier to parse by other programs.  Currently --robot is sup-
 880        ported only together with --version,  --info-memory,  and  --list.   It
 881        will  be  supported  for  normal  compression  and decompression in the
 882        future.
 883
 884    Version
 885        xz --robot --version will print the version number of xz and liblzma in
 886        the following format:
 887
 888        XZ_VERSION=XYYYZZZS
 889        LIBLZMA_VERSION=XYYYZZZS
 890
 891        X      Major version.
 892
 893        YYY    Minor  version.  Even numbers are stable.  Odd numbers are alpha
 894               or beta versions.
 895
 896        ZZZ    Patch level for stable releases or just a counter  for  develop-
 897               ment releases.
 898
 899        S      Stability.  0 is alpha, 1 is beta, and 2 is stable.  S should
 900               always be 2 when YYY is even.
 901
 902        XYYYZZZS are the same on both lines if xz and liblzma are from the same
 903        XZ Utils release.
 904
 905        Examples: 4.999.9beta is 49990091 and 5.0.0 is 50000002.
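
            A script can split the XYYYZZZS number back into its fields with
            integer arithmetic.  The following sh(1) sketch assumes a made-up
            function name, decode_xz_version, and takes only the digits after
            the equals sign as its argument.

```shell
# Decode an XYYYZZZS version number into "major.minor.patch stability".
# Hypothetical helper for illustration; not part of xz.
decode_xz_version() {
    v=$1
    major=$((v / 10000000))
    minor=$((v / 10000 % 1000))
    patch=$((v / 10 % 1000))
    case $((v % 10)) in
        0) stability=alpha ;;
        1) stability=beta ;;
        2) stability=stable ;;
        *) stability=unknown ;;
    esac
    echo "$major.$minor.$patch $stability"
}
```

            With the examples above, decode_xz_version 49990091 prints
            4.999.9 beta and decode_xz_version 50000002 prints 5.0.0 stable.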
 906
 907    Memory limit information
 908        xz  --robot --info-memory prints a single line with three tab-separated
 909        columns:
 910
 911        1.  Total amount of physical memory (RAM) in bytes
 912
 913        2.  Memory usage limit for compression in bytes.  A  special  value  of
 914            zero  indicates the default setting, which for single-threaded mode
 915            is the same as no limit.
 916
 917        3.  Memory usage limit for decompression in bytes.  A special value  of
 918            zero  indicates the default setting, which for single-threaded mode
 919            is the same as no limit.
 920
 921        In the future, the output of xz --robot  --info-memory  may  have  more
 922        columns, but never more than a single line.
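
            A minimal sh(1) sketch of reading this line follows.  The function
            name describe_memory_limits and its output format are made up for
            this example; it reads one line from standard input, ignores any
            columns after the third, and reports a limit of zero as the
            default setting.

```shell
# Read one line of "xz --robot --info-memory" output from standard
# input and print the three documented columns with labels.
# Hypothetical helper for illustration; not part of xz.
describe_memory_limits() {
    tab=$(printf '\t')
    # Columns beyond the third, if any, end up in "rest".
    IFS=$tab read -r total climit dlimit rest || return 1
    if [ "$climit" -eq 0 ]; then climit=default; fi
    if [ "$dlimit" -eq 0 ]; then dlimit=default; fi
    echo "ram=$total compress=$climit decompress=$dlimit"
}
```

            It would typically be used as
            xz --robot --info-memory | describe_memory_limits.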
 923
 924    List mode
 925        xz --robot --list uses tab-separated output.  The first column of every
 926        line has a string that indicates the type of the information  found  on
 927        that line:
 928
 929        name   This is always the first line when starting to list a file.  The
 930               second column on the line is the filename.
 931
 932        file   This line contains overall information about the .xz file.  This
 933               line is always printed after the name line.
 934
 935        stream This line type is used only when --verbose was specified.  There
 936               are as many stream lines as there are streams in the .xz file.
 937
 938        block  This line type is used only when --verbose was specified.  There
 939               are  as  many  block  lines as there are blocks in the .xz file.
 940               The block lines are shown after all the stream lines;  different
 941               line types are not interleaved.
 942
 943        summary
 944               This  line type is used only when --verbose was specified twice.
 945               This line is printed after all block lines.  Like the file line,
 946               the  summary  line  contains  overall  information about the .xz
 947               file.
 948
 949        totals This line is always the very last line of the list  output.   It
 950               shows the total counts and sizes.
 951
 952        The columns of the file lines:
 953               2.  Number of streams in the file
 954               3.  Total number of blocks in the stream(s)
 955               4.  Compressed size of the file
 956               5.  Uncompressed size of the file
 957               6.  Compression  ratio,  for  example  0.123.   If ratio is over
 958                   9.999, three dashes  (---)  are  displayed  instead  of  the
 959                   ratio.
 960               7.  Comma-separated  list of integrity check names.  The follow-
 961                   ing strings are used for the known check types: None, CRC32,
 962                   CRC64,  and  SHA-256.  For unknown check types, Unknown-N is
 963                   used, where N is the Check ID as a decimal  number  (one  or
 964                   two digits).
 965               8.  Total size of stream padding in the file
 966
 967        The columns of the stream lines:
 968               2.  Stream number (the first stream is 1)
 969               3.  Number of blocks in the stream
 970               4.  Compressed start offset
 971               5.  Uncompressed start offset
 972               6.  Compressed size (does not include stream padding)
 973               7.  Uncompressed size
 974               8.  Compression ratio
 975               9.  Name of the integrity check
 976               10. Size of stream padding
 977
 978        The columns of the block lines:
 979               2.  Number of the stream containing this block
 980               3.  Block  number  relative  to the beginning of the stream (the
 981                   first block is 1)
 982               4.  Block number relative to the beginning of the file
 983               5.  Compressed start offset relative to  the  beginning  of  the
 984                   file
 985               6.  Uncompressed  start  offset relative to the beginning of the
 986                   file
 987               7.  Total compressed size of the block (includes headers)
 988               8.  Uncompressed size
 989               9.  Compression ratio
 990               10. Name of the integrity check
 991
 992        If --verbose was specified twice, additional columns  are  included  on
 993        the  block  lines.   These  are  not displayed with a single --verbose,
 994        because getting this information requires many seeks and  can  thus  be
 995        slow:
 996               11. Value of the integrity check in hexadecimal
 997               12. Block header size
 998               13. Block  flags:  c  indicates that compressed size is present,
 999                   and u indicates that uncompressed size is present.   If  the
1000                   flag  is  not  set,  a dash (-) is shown instead to keep the
1001                   string length fixed.  New flags may be added to the  end  of
1002                   the string in the future.
1003               14. Size  of  the  actual  compressed  data  in  the block (this
1004                   excludes the block header, block padding, and check fields)
1005               15. Amount of memory (in  bytes)  required  to  decompress  this
1006                   block with this xz version
1007               16. Filter  chain.   Note  that most of the options used at com-
1008                   pression time cannot be known, because only the options that
1009                   are needed for decompression are stored in the .xz headers.
1010
1011        The columns of the totals line:
1012               2.  Number of streams
1013               3.  Number of blocks
1014               4.  Compressed size
1015               5.  Uncompressed size
1016               6.  Average compression ratio
1017               7.  Comma-separated  list  of  integrity  check  names that were
1018                   present in the files
1019               8.  Stream padding size
1020               9.  Number of files.  This is here to keep the order of the ear-
1021                   lier columns the same as on file lines.
1022
1023        If  --verbose  was  specified twice, additional columns are included on
1024        the totals line:
1025               10. Maximum amount of memory (in bytes) required  to  decompress
1026                   the files with this xz version
1027               11. yes  or  no  indicating  if all block headers have both com-
1028                   pressed size and uncompressed size stored in them
1029
1030        Future versions may add new line types and new columns can be added  to
1031        the existing line types, but the existing columns won't be changed.
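
            Because the first column names the line type, the output is easy
            to filter with awk(1).  The sketch below assumes a made-up
            function name, summarize_xz_list, and uses only the name line and
            columns 4-6 of the file line as described above; unknown line
            types and extra columns are ignored, which is in line with the
            compatibility promise.

```shell
# Summarize "xz --robot --list" output read from standard input:
# one line per file with uncompressed size, compressed size, and ratio.
# Hypothetical helper for illustration; not part of xz.
summarize_xz_list() {
    awk -F '\t' '
        $1 == "name" { name = $2 }
        $1 == "file" {
            printf "%s: %s -> %s bytes (ratio %s)\n", name, $5, $4, $6
        }
    '
}
```

            It would typically be used as
            xz --robot --list foo.xz | summarize_xz_list, where foo.xz is a
            placeholder file name.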
1032
1033 EXIT STATUS
1034        0      All is good.
1035
1036        1      An error occurred.
1037
1038        2      Something  worth  a  warning  occurred,  but  no  actual  errors
1039               occurred.
1040
1041        Notices (not warnings or errors) printed on standard error don't affect
1042        the exit status.
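
            In scripts, the three values are usually handled with a case
            statement.  The following sh(1) sketch uses a made-up function
            name, classify_xz_status, and only maps the documented exit
            statuses to words.

```shell
# Map an xz exit status to a short description.
# Hypothetical helper for illustration; not part of xz.
classify_xz_status() {
    case $1 in
        0) echo ok ;;
        1) echo error ;;
        2) echo warning ;;
        *) echo unknown ;;
    esac
}
```

            For example, after running xz -d foo.xz, a script could call
            classify_xz_status $? and continue when the result is ok or
            warning.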
1043
1044 ENVIRONMENT
1045        xz  parses  space-separated lists of options from the environment vari-
1046        ables XZ_DEFAULTS and XZ_OPT, in this order, before parsing the options
1047        from  the  command  line.   Note  that only options are parsed from the
1048        environment variables; all non-options are silently  ignored.   Parsing
1049        is done with getopt_long(3), which is also used for the command line
1050        arguments.
1051
1052        XZ_DEFAULTS
1053               User-specific or system-wide default options.  Typically this is
1054               set in a shell initialization script to enable xz's memory usage
1055               limiter by default.  Excluding shell initialization scripts  and
1056               similar   special   cases,  scripts  must  never  set  or  unset
1057               XZ_DEFAULTS.
1058
1059        XZ_OPT This is for passing options to xz when it is not possible to set
1060               the  options  directly on the xz command line.  This is the case
1061               e.g. when xz is run by a script or tool, e.g. GNU tar(1):
1062
1063                      XZ_OPT=-2v tar caf foo.tar.xz foo
1064
1065               Scripts may use XZ_OPT e.g. to set script-specific default  com-
1066               pression  options.   It  is  still recommended to allow users to
1067               override XZ_OPT if that is reasonable, e.g. in sh(1) scripts one
1068               may use something like this:
1069
1070                      XZ_OPT=${XZ_OPT-"-7e"}
1071                      export XZ_OPT
1072
1073 LZMA UTILS COMPATIBILITY
1074        The  command  line  syntax  of  xz  is  practically a superset of lzma,
1075        unlzma, and lzcat as found in LZMA Utils 4.32.x.  In most cases, it
1076        is possible to replace LZMA Utils with XZ Utils without breaking exist-
1077        ing scripts.  There are some incompatibilities though, which may  some-
1078        times cause problems.
1079
1080    Compression preset levels
1081        The  numbering  of the compression level presets is not identical in xz
1082        and LZMA Utils.  The most important difference is how dictionary  sizes
1083        are  mapped  to different presets.  Dictionary size is roughly equal to
1084        the decompressor memory usage.
1085
1086               Level     xz      LZMA Utils
1087                -0     256 KiB      N/A
1088                -1       1 MiB     64 KiB
1089                -2       2 MiB      1 MiB
1090                -3       4 MiB    512 KiB
1091                -4       4 MiB      1 MiB
1093                -5       8 MiB      2 MiB
1094                -6       8 MiB      4 MiB
1095                -7      16 MiB      8 MiB
1096                -8      32 MiB     16 MiB
1097                -9      64 MiB     32 MiB
1098
1099        The dictionary size differences affect the compressor memory usage too,
1100        but  there  are some other differences between LZMA Utils and XZ Utils,
1101        which make the difference even bigger:
1102
1103               Level     xz      LZMA Utils 4.32.x
1104                -0       3 MiB          N/A
1105                -1       9 MiB          2 MiB
1106                -2      17 MiB         12 MiB
1107                -3      32 MiB         12 MiB
1108                -4      48 MiB         16 MiB
1109                -5      94 MiB         26 MiB
1110                -6      94 MiB         45 MiB
1111                -7     186 MiB         83 MiB
1112                -8     370 MiB        159 MiB
1113                -9     674 MiB        311 MiB
1114
1115        The default preset level in LZMA Utils is -7 while in XZ  Utils  it  is
1116        -6, so both use an 8 MiB dictionary by default.
1117
1118    Streamed vs. non-streamed .lzma files
1119        The  uncompressed  size  of the file can be stored in the .lzma header.
1120        LZMA Utils does that when compressing regular files.   The  alternative
1121        is to mark the uncompressed size as unknown and use an end-of-payload
1122        marker to indicate where the decompressor should stop.  LZMA Utils uses
1123        this method when the uncompressed size isn't known, which is the case,
1124        for example, when reading from a pipe.
1125
1126        xz supports decompressing .lzma files with or without an end-of-payload
1127        marker, but all .lzma files created by xz use the end-of-payload marker
1128        and have the uncompressed size marked as unknown in the .lzma header.
1129        This may be a problem in some uncommon situations.  For example, a
1130        .lzma decompressor in an embedded device might work only with files
1131        that have a known uncompressed size.  If you hit this problem, you
1132        need to use LZMA Utils or LZMA SDK to create .lzma files with a known
1133        uncompressed size.
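
            Whether a given .lzma file stores its uncompressed size can be
            checked from the header.  The following sh(1) function is only a
            sketch.  It assumes the usual 13-byte .lzma header layout (one
            properties byte, a 32-bit dictionary size, then a 64-bit
            uncompressed size in which all bits set means unknown).  The name
            lzma_size_field is made up for this example:

```shell
# Print "unknown" if the size field of the .lzma header in file $1 is
# all 0xFF bytes, otherwise print "known".
# Assumption: bytes 5..12 (zero-based) hold the uncompressed size.
lzma_size_field() {
    # Dump 8 bytes starting at offset 5 as hex and strip all whitespace.
    field=$(od -An -tx1 -j5 -N8 "$1" | tr -d '[:space:]')
    if [ "${#field}" -ne 16 ]; then
        echo "$1: too short to contain a .lzma header" >&2
        return 1
    fi
    if [ "$field" = ffffffffffffffff ]; then
        echo unknown
    else
        echo known
    fi
}
```

            For a file created by xz --format=lzma, this sketch would be
            expected to print unknown.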
1134
1135    Unsupported .lzma files
1136        The .lzma format allows lc values up to 8, and lp values up to 4.  LZMA
1137        Utils can decompress files with any lc and lp, but always creates files
1138        with  lc=3  and  lp=0.  Creating files with other lc and lp is possible
1139        with xz and with LZMA SDK.
1140
1141        The implementation of the LZMA1 filter in liblzma requires that the sum
1142        of lc and lp not exceed 4.  Thus, .lzma files that exceed this limit
1143        cannot be decompressed with xz.
1144
1145        LZMA Utils creates only .lzma files which have a dictionary size of 2^n
1146        (a  power  of  2)  but accepts files with any dictionary size.  liblzma
1147        accepts only .lzma files which have a dictionary size of 2^n or  2^n  +
1148        2^(n-1).   This  is  to  decrease  false positives when detecting .lzma
1149        files.
1150
1151        These limitations shouldn't be a problem in practice, since practically
1152        all  .lzma  files  have been compressed with settings that liblzma will
1153        accept.
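
            The two limits above can be written as small sh(1) checks.  This
            is a sketch of the rules as stated in this section, not the code
            that liblzma uses.  The function names are hypothetical:

```shell
# Succeed if $1 is a positive power of two.
is_pow2() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Succeed if the dictionary size $1 (in bytes) is 2^n or 2^n + 2^(n-1).
dict_size_ok() {
    is_pow2 "$1" && return 0
    # 2^n + 2^(n-1) is the same as 3 * 2^(n-1).
    [ $(( $1 % 3 )) -eq 0 ] && is_pow2 $(( $1 / 3 ))
}

# Succeed if lc ($1) plus lp ($2) does not exceed 4.
lc_lp_ok() {
    [ $(( $1 + $2 )) -le 4 ]
}
```

            For example, dict_size_ok 12582912 (8 MiB + 4 MiB) succeeds, and
            lc_lp_ok 8 0 fails.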
1154
1155    Trailing garbage
1156        When decompressing, LZMA Utils silently ignores everything after the
1157        first .lzma stream.  In most situations, this is a bug.  This also
1158        means that LZMA Utils doesn't support decompressing concatenated .lzma
1159        files.
1160
1161        If  there  is  data left after the first .lzma stream, xz considers the
1162        file to be corrupt.  This may break obscure scripts which have  assumed
1163        that trailing garbage is ignored.
1164
1165 NOTES
1166    Compressed output may vary
1167        The  exact  compressed output produced from the same uncompressed input
1168        file may vary between XZ Utils versions even if compression options are
1169        identical.  This is because the encoder can be improved (faster or bet-
1170        ter compression) without affecting the file  format.   The  output  can
1171        vary  even  between  different  builds of the same XZ Utils version, if
1172        different build options are used.
1173
1174        The above means that implementing --rsyncable to create  rsyncable  .xz
1175        files  is  not  going  to happen without freezing a part of the encoder
1176        implementation, which can then be used with --rsyncable.
1177
1178    Embedded .xz decompressors
1179        Embedded .xz decompressor implementations like XZ Embedded don't neces-
1180        sarily support files created with integrity check types other than none
1181        and  crc32.   Since  the  default  is  --check=crc64,  you   must   use
1182        --check=none or --check=crc32 when creating files for embedded systems.
1183
1184        Outside  embedded systems, all .xz format decompressors support all the
1185        check types, or at least are able to decompress the file without  veri-
1186        fying the integrity check if the particular check is not supported.
1187
1188        XZ  Embedded supports BCJ filters, but only with the default start off-
1189        set.
1190
1191 EXAMPLES
1192    Basics
1193        Compress the file foo into foo.xz using the default  compression  level
1194        (-6), and remove foo if compression is successful:
1195
1196               xz foo
1197
1198        Decompress  bar.xz  into bar and don't remove bar.xz even if decompres-
1199        sion is successful:
1200
1201               xz -dk bar.xz
1202
1203        Create baz.tar.xz with the preset -4e (-4 --extreme), which  is  slower
1204        than  e.g.  the  default  -6, but needs less memory for compression and
1205        decompression (48 MiB and 5 MiB, respectively):
1206
1207               tar cf - baz | xz -4e > baz.tar.xz
1208
1209        A mix of compressed and uncompressed files can be decompressed to stan-
1210        dard output with a single command:
1211
1212               xz -dcf a.txt b.txt.xz c.txt d.txt.lzma > abcd.txt
1213
1214    Parallel compression of many files
1215        On  GNU  and *BSD, find(1) and xargs(1) can be used to parallelize com-
1216        pression of many files:
1217
1218               find . -type f \! -name '*.xz' -print0 \
1219                   | xargs -0r -P4 -n16 xz -T1
1220
1221        The -P option to xargs(1) sets the number  of  parallel  xz  processes.
1222        The best value for the -n option depends on how many files there are to
1223        be compressed.  If there are only a couple of files, the  value  should
1224        probably be 1; with tens of thousands of files, 100 or even more may be
1225        appropriate to reduce the number of xz  processes  that  xargs(1)  will
1226        eventually create.
1227
1228        The  option  -T1  for  xz is there to force it to single-threaded mode,
1229        because xargs(1) is used to control the amount of parallelization.
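
            To see how -n groups file names before running the real command,
            echo can stand in for xz.  A minimal sketch, assuming an xargs(1)
            that supports -0 (as on GNU and *BSD).  The name show_batches and
            the file names are placeholders, and -P is left out so that the
            output order stays predictable:

```shell
# Dry run: print the xz command lines that xargs would start, one per
# batch, without compressing anything.
# $1 is the batch size for -n; the remaining arguments are file names.
show_batches() {
    n=$1
    shift
    printf '%s\0' "$@" | xargs -0 -n"$n" echo xz -T1
}

show_batches 2 a b c d e
```

            With five names and -n2, three command lines are printed, the
            last one with a single file name.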
1230
1231    Robot mode
1232        Calculate how many bytes have been saved  in  total  after  compressing
1233        multiple files:
1234
1235               xz --robot --list *.xz | awk '/^totals/{print $5-$4}'
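
            The same calculation can be wrapped in a function that reads the
            --robot --list output from standard input and also reports the
            saving as a percentage.  A sketch, assuming the totals line keeps
            the compressed size in column 4 and the uncompressed size in
            column 5, as the command above does.  The name saved_bytes is
            made up for this example:

```shell
# Read "xz --robot --list" output on stdin and summarize the totals line.
# Assumption: field 4 is the compressed size, field 5 the uncompressed size.
saved_bytes() {
    awk '/^totals/ {
        saved = $5 - $4
        # Avoid dividing by zero when the files are empty.
        pct = ($5 > 0) ? 100 * saved / $5 : 0
        printf "%.0f bytes saved (%.1f%%)\n", saved, pct
    }'
}

# Usage: xz --robot --list *.xz | saved_bytes
```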
1236
1237        A script may want to verify that it is using a new enough xz.  The
1238        following sh(1) script checks that the version number of the xz tool
1239        is at least 5.0.0.  This method is compatible with old beta versions,
1240        which didn't support the --robot option:
1241
1242               if ! eval "$(xz --robot --version 2> /dev/null)" ||
1243                       [ "$XZ_VERSION" -lt 50000002 ]; then
1244                   echo "Your xz is too old."
1245               fi
1246               unset XZ_VERSION LIBLZMA_VERSION
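
            The value compared above packs the whole version into one
            integer.  A sketch of how it can be split apart, assuming the
            XYYYZZZS layout (X major, YYY minor, ZZZ patch, S stability with
            0 = alpha, 1 = beta, 2 = stable).  The name xz_version_string is
            a hypothetical helper:

```shell
# Convert an XZ_VERSION style integer ($1) to "major.minor.patch stability".
# Assumption: the layout is XYYYZZZS as described in the lead-in.
xz_version_string() {
    v=$1
    major=$(( $v / 10000000 ))
    minor=$(( $v / 10000 % 1000 ))
    patch=$(( $v / 10 % 1000 ))
    case $(( $v % 10 )) in
        0) stability=alpha ;;
        1) stability=beta ;;
        2) stability=stable ;;
        *) stability=unknown ;;
    esac
    echo "$major.$minor.$patch $stability"
}

xz_version_string 50000002    # the minimum accepted by the check above
```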
1247
1248        Set a memory usage limit for decompression using XZ_OPT, but if a limit
1249        has already been set, don't increase it:
1250
1251               NEWLIM=$((123 << 20))  # 123 MiB
1252               OLDLIM=$(xz --robot --info-memory | cut -f3)
1253               if [ "$OLDLIM" -eq 0 ] || [ "$OLDLIM" -gt "$NEWLIM" ]; then
1254                   XZ_OPT="$XZ_OPT --memlimit-decompress=$NEWLIM"
1255                   export XZ_OPT
1256               fi
1257
1258    Custom compressor filter chains
1259        The simplest use for custom filter chains is customizing an LZMA2
1260        preset.  This can be useful because the presets cover only a subset of
1261        the potentially useful combinations of compression settings.
1262
1263        The  CompCPU columns of the tables from the descriptions of the options
1264        -0 ... -9 and --extreme are  useful  when  customizing  LZMA2  presets.
1265        Here are the relevant parts collected from those two tables:
1266
1267               Preset   CompCPU
1268                -0         0
1269                -1         1
1270                -2         2
1271                -3         3
1272                -4         4
1273                -5         5
1274                -6         6
1275                -5e        7
1276                -6e        8
1277
1278        If you know that a file requires a fairly big dictionary (e.g. 32 MiB)
1279        to compress well, but you want to compress it more quickly than xz -8
1280        would, a preset with a low CompCPU value (e.g. 1) can be modified to
1281        use a bigger dictionary:
1282
1283               xz --lzma2=preset=1,dict=32MiB foo.tar
1284
1285        With certain files, the above command may be faster than  xz  -6  while
1286        compressing  significantly better.  However, it must be emphasized that
1287        only some files benefit from a big dictionary while keeping the CompCPU
1288        value low.  The most obvious situation where a big dictionary can help
1289        a lot is an archive containing very similar files of at least a few
1290        megabytes  each.   The  dictionary  size has to be significantly bigger
1291        than any individual file to allow LZMA2 to take full advantage  of  the
1292        similarities between consecutive files.
1293
1294        If  very high compressor and decompressor memory usage is fine, and the
1295        file being compressed is at least several hundred megabytes, it may  be
1296        useful  to  use  an  even  bigger dictionary than the 64 MiB that xz -9
1297        would use:
1298
1299               xz -vv --lzma2=dict=192MiB big_foo.tar
1300
1301        Using -vv (--verbose --verbose) as in the above example can be useful
1302        to see the memory requirements of the compressor and decompressor.
1303        Remember that using a dictionary bigger than the size of the
1304        uncompressed file is a waste of memory, so the above command isn't
1305        useful for small files.
1306
1307        Sometimes the compression time doesn't  matter,  but  the  decompressor
1308        memory  usage has to be kept low e.g. to make it possible to decompress
1309        the file on an embedded system.  The following  command  uses  -6e  (-6
1310        --extreme)  as  a  base  and  sets  the dictionary to only 64 KiB.  The
1311        resulting file can be decompressed with XZ Embedded (that's  why  there
1312        is --check=crc32) using about 100 KiB of memory.
1313
1314               xz --check=crc32 --lzma2=preset=6e,dict=64KiB foo
1315
1316        If  you  want  to  squeeze out as many bytes as possible, adjusting the
1317        number of literal context bits (lc) and number of  position  bits  (pb)
1318        can sometimes help.  Adjusting the number of literal position bits (lp)
1319        might help too, but usually lc and pb are more important.  E.g. a
1320        source code archive contains mostly US-ASCII text, so something like
1321        the following might give a slightly (about 0.1 %) smaller file than
1322        xz -6e (try also without lc=4):
1323
1324               xz --lzma2=preset=6e,pb=0,lc=4 source_code.tar
1325
1326        Using  another  filter together with LZMA2 can improve compression with
1327        certain file types.  E.g. to compress an x86-32 or x86-64 shared library
1328        using the x86 BCJ filter:
1329
1330               xz --x86 --lzma2 libfoo.so
1331
1332        Note  that the order of the filter options is significant.  If --x86 is
1333        specified after --lzma2, xz will give an error, because there cannot be
1334        any  filter  after LZMA2, and also because the x86 BCJ filter cannot be
1335        used as the last filter in the chain.
1336
1337        The Delta filter together with LZMA2 can give good results with  bitmap
1338        images.  It should usually beat PNG, which has a few more advanced fil-
1339        ters than simple delta but uses Deflate for the actual compression.
1340
1341        The image has to be saved in an uncompressed format, e.g. as
1342        uncompressed TIFF.  The distance parameter of the Delta filter is set
1343        to match the number of bytes per pixel in the image.  E.g. a 24-bit RGB
1344        bitmap needs dist=3, and it is also good to pass pb=0 to LZMA2 to
1345        accommodate the three-byte alignment:
1346
1347               xz --delta=dist=3 --lzma2=pb=0 foo.tiff
1348
1349        If multiple images have been put into a single archive (e.g. .tar), the
1350        Delta  filter will work on that too as long as all images have the same
1351        number of bytes per pixel.
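
            The dist value follows directly from the pixel format.  A small
            sketch, assuming pixels that occupy a whole number of bytes.  The
            name delta_dist is a hypothetical helper, not part of xz:

```shell
# Print the Delta filter distance for an image with $1 bits per pixel.
# Assumption: every pixel occupies a whole number of bytes.
delta_dist() {
    bits=$1
    if [ "$bits" -le 0 ] || [ $(( $bits % 8 )) -ne 0 ]; then
        echo "bits per pixel must be a positive multiple of 8" >&2
        return 1
    fi
    echo $(( $bits / 8 ))
}

# Usage: xz --delta=dist="$(delta_dist 24)" --lzma2=pb=0 foo.tiff
```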
1352
1353 SEE ALSO
1354        xzdec(1),  xzdiff(1),   xzgrep(1),   xzless(1),   xzmore(1),   gzip(1),
1355        bzip2(1), 7z(1)
1356
1357        XZ Utils: <http://tukaani.org/xz/>
1358        XZ Embedded: <http://tukaani.org/xz/embedded.html>
1359        LZMA SDK: <http://7-zip.org/sdk.html>
1360
1361
1362
1363 Tukaani                           2010-10-04                             XZ(1)