src/parallel.pod

   1 #!/usr/bin/perl -w
   2
   3 =encoding utf8
   4
   5 =head1 NAME
   6
   7 parallel - build and execute shell command lines from standard input
   8 in parallel
   9
  10
  11 =head1 SYNOPSIS
  12
  13 B<parallel> [options] [I<command> [arguments]] < list_of_arguments
  14
  15 B<parallel> [options] [I<command> [arguments]] ( B<:::> arguments |
  16 B<:::+> arguments | B<::::> argfile(s) | B<::::+> argfile(s) ) ...
  17
  18 B<parallel> --semaphore [options] I<command>
  19
  20 B<#!/usr/bin/parallel> --shebang [options] [I<command> [arguments]]
  21
  22 B<#!/usr/bin/parallel> --shebang-wrap [options] [I<command>
  23 [arguments]]
  24
  25
  26 =head1 DESCRIPTION
  27
  28 STOP!
  29
  30 Read the B<Reader's guide> below if you are new to GNU B<parallel>.
  31
  32 GNU B<parallel> is a shell tool for executing jobs in parallel using
  33 one or more computers. A job can be a single command or a small script
  34 that has to be run for each of the lines in the input. The typical
  35 input is a list of files, a list of hosts, a list of users, a list of
  36 URLs, or a list of tables. A job can also be a command that reads from
  37 a pipe. GNU B<parallel> can then split the input into blocks and pipe
  38 a block into each command in parallel.
  39
  40 If you use xargs and tee today you will find GNU B<parallel> very easy
  41 to use as GNU B<parallel> is written to have the same options as
  42 xargs. If you write loops in shell, you will find GNU B<parallel> may
  43 be able to replace most of the loops and make them run faster by
  44 running several jobs in parallel.
  45
  46 GNU B<parallel> makes sure output from the commands is the same output
  47 as you would get had you run the commands sequentially. This makes it
  48 possible to use output from GNU B<parallel> as input for other
  49 programs.
  50
  51 For each line of input GNU B<parallel> will execute I<command> with
  52 the line as arguments. If no I<command> is given, the line of input is
  53 executed. Several lines will be run in parallel. GNU B<parallel> can
  54 often be used as a substitute for B<xargs> or B<cat | bash>.
  55
  56 =head2 Reader's guide
  57
  58 If you prefer reading a book, buy B<GNU Parallel 2018> at
  59 http://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html
  60 or download it at: https://doi.org/10.5281/zenodo.1146014
  61
  62 Otherwise start by watching the intro videos for a quick introduction:
  63 http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
  64
  65 If you need a one page printable cheat sheet you can find it on:
  66 https://www.gnu.org/software/parallel/parallel_cheat.pdf
  67
  68 You can find a lot of B<EXAMPLE>s of use after the list of B<OPTIONS>
  69 in B<man parallel> (Use B<LESS=+/EXAMPLE: man parallel>). That will
  70 give you an idea of what GNU B<parallel> is capable of, and you may
  71 find a solution you can simply adapt to your situation.
  72
  73 If you want to dive even deeper: spend a couple of hours walking
  74 through the tutorial (B<man parallel_tutorial>). Your command line
  75 will love you for it.
  76
  77 Finally you may want to look at the rest of the manual (B<man
  78 parallel>) if you have special needs not already covered.
  79
  80 If you want to know the design decisions behind GNU B<parallel>, try:
  81 B<man parallel_design>. This is also a good intro if you intend to
  82 change GNU B<parallel>.
  83
  84
  85 =head1 OPTIONS
  86
  87 =over 4
  88
  89 =item I<command>
  90
  91 Command to execute.  If I<command> or the following arguments contain
  92 replacement strings (such as B<{}>) every instance will be substituted
  93 with the input.
  94
  95 If I<command> is given, GNU B<parallel> solves the same tasks as
  96 B<xargs>. If I<command> is not given, GNU B<parallel> will behave
  97 similarly to B<cat | sh>.
  98
  99 The I<command> must be an executable, a script, a composed command, an
 100 alias, or a function.
 101
 102 B<Bash functions>: B<export -f> the function first or use B<env_parallel>.
 103
 104 B<Bash, Csh, or Tcsh aliases>: Use B<env_parallel>.
 105
 106 B<Zsh, Fish, Ksh, and Pdksh functions and aliases>: Use B<env_parallel>.
 107
 108 =item B<{}>
 109
 110 Input line. This replacement string will be replaced by a full line
 111 read from the input source. The input source is normally stdin
 112 (standard input), but can also be given with B<-a>, B<:::>, or
 113 B<::::>.
 114
 115 The replacement string B<{}> can be changed with B<-I>.
 116
 117 If the command line contains no replacement strings then B<{}> will be
 118 appended to the command line.
 119
 120 Replacement strings are normally quoted, so special characters are not
 121 parsed by the shell. The exception is if the command starts with a
 122 replacement string; then the string is not quoted.
 123
 124
 125 =item B<{.}>
 126
 127 Input line without extension. This replacement string will be replaced
 128 by the input with the extension removed. If the input line contains
 129 B<.> after the last B</>, the last B<.> until the end of the string
 130 will be removed and B<{.}> will be replaced with the
 131 remainder. E.g. I<foo.jpg> becomes I<foo>, I<subdir/foo.jpg> becomes
 132 I<subdir/foo>, I<sub.dir/foo.jpg> becomes I<sub.dir/foo>,
 133 I<sub.dir/bar> remains I<sub.dir/bar>. If the input line does not
 134 contain B<.> it will remain unchanged.
 135
 136 The replacement string B<{.}> can be changed with B<--er>.
 137
 138 To understand replacement strings see B<{}>.
 139
 140
 141 =item B<{/}>
 142
 143 Basename of input line. This replacement string will be replaced by
 144 the input with the directory part removed.
 145
 146 The replacement string B<{/}> can be changed with
 147 B<--basenamereplace>.
 148
 149 To understand replacement strings see B<{}>.
 150
 151
 152 =item B<{//}>
 153
 154 Dirname of input line. This replacement string will be replaced by the
 155 dir of the input line. See B<dirname>(1).
 156
 157 The replacement string B<{//}> can be changed with
 158 B<--dirnamereplace>.
 159
 160 To understand replacement strings see B<{}>.
 161
 162
 163 =item B<{/.}>
 164
 165 Basename of input line without extension. This replacement string will
 166 be replaced by the input with the directory and extension part
 167 removed. It is a combination of B<{/}> and B<{.}>.
 168
 169 The replacement string B<{/.}> can be changed with
 170 B<--basenameextensionreplace>.
 171
 172 To understand replacement strings see B<{}>.
 173
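The path-related replacement strings B<{.}>, B<{/}>, B<{//}>, and
B<{/.}> can be approximated in plain shell. This is only a sketch of
the rules described above, not how GNU B<parallel> implements them;
the function names B<strip_ext>, B<base_name>, B<dir_name>, and
B<base_no_ext> are made up for illustration, and only the cases listed
in the text are covered.

```shell
# Plain-shell approximations of {.}, {/}, {//} and {/.}.
# All function names are hypothetical helpers for this sketch.

# {.} : drop the extension, but only if a "." occurs after the last "/".
strip_ext() {
  case "${1##*/}" in
    *.*) printf '%s\n' "${1%.*}" ;;
    *)   printf '%s\n' "$1" ;;
  esac
}

# {/} : drop the directory part.
base_name() { printf '%s\n' "${1##*/}"; }

# {//} : keep only the directory part, see dirname(1).
dir_name() { dirname "$1"; }

# {/.} : combination of {/} and {.}.
base_no_ext() { strip_ext "$(base_name "$1")"; }

# Show all four forms for the inputs used in the text above.
for f in foo.jpg subdir/foo.jpg sub.dir/foo.jpg sub.dir/bar; do
  printf '%s: {.}=%s {/}=%s {//}=%s {/.}=%s\n' \
    "$f" "$(strip_ext "$f")" "$(base_name "$f")" \
    "$(dir_name "$f")" "$(base_no_ext "$f")"
done
```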
 174
 175 =item B<{#}>
 176
 177 Sequence number of the job to run. This replacement string will be
 178 replaced by the sequence number of the job being run. It contains the
 179 same number as $PARALLEL_SEQ.
 180
 181 The replacement string B<{#}> can be changed with B<--seqreplace>.
 182
 183 To understand replacement strings see B<{}>.
 184
 185
 186 =item B<{%}>
 187
 188 Job slot number. This replacement string will be replaced by the job's
 189 slot number between 1 and number of jobs to run in parallel. There
 190 will never be 2 jobs running at the same time with the same job slot
 191 number.
 192
 193 The replacement string B<{%}> can be changed with B<--slotreplace>.
 194
 195 To understand replacement strings see B<{}>.
 196
 197
 198 =item B<{>I<n>B<}>
 199
 200 Argument from input source I<n> or the I<n>'th argument. This
 201 positional replacement string will be replaced by the input from input
 202 source I<n> (when used with B<-a> or B<::::>) or with the I<n>'th
 203 argument (when used with B<-N>). If I<n> is negative it refers to the
 204 I<n>'th last argument.
 205
 206 To understand replacement strings see B<{}>.
 207
 208
 209 =item B<{>I<n>.B<}>
 210
 211 Argument from input source I<n> or the I<n>'th argument without
 212 extension. It is a combination of B<{>I<n>B<}> and B<{.}>.
 213
 214 This positional replacement string will be replaced by the input from
 215 input source I<n> (when used with B<-a> or B<::::>) or with the
 216 I<n>'th argument (when used with B<-N>). The input will have the
 217 extension removed.
 218
 219 To understand positional replacement strings see B<{>I<n>B<}>.
 220
 221
 222 =item B<{>I<n>/B<}>
 223
 224 Basename of argument from input source I<n> or the I<n>'th argument.
 225 It is a combination of B<{>I<n>B<}> and B<{/}>.
 226
 227 This positional replacement string will be replaced by the input from
 228 input source I<n> (when used with B<-a> or B<::::>) or with the
 229 I<n>'th argument (when used with B<-N>). The input will have the
 230 directory (if any) removed.
 231
 232 To understand positional replacement strings see B<{>I<n>B<}>.
 233
 234
 235 =item B<{>I<n>//B<}>
 236
 237 Dirname of argument from input source I<n> or the I<n>'th argument.
 238 It is a combination of B<{>I<n>B<}> and B<{//}>.
 239
 240 This positional replacement string will be replaced by the dir of the
 241 input from input source I<n> (when used with B<-a> or B<::::>) or with
 242 the I<n>'th argument (when used with B<-N>). See B<dirname>(1).
 243
 244 To understand positional replacement strings see B<{>I<n>B<}>.
 245
 246
 247 =item B<{>I<n>/.B<}>
 248
 249 Basename of argument from input source I<n> or the I<n>'th argument
 250 without extension.  It is a combination of B<{>I<n>B<}>, B<{/}>, and
 251 B<{.}>.
 252
 253 This positional replacement string will be replaced by the input from
 254 input source I<n> (when used with B<-a> or B<::::>) or with the
 255 I<n>'th argument (when used with B<-N>). The input will have the
 256 directory (if any) and extension removed.
 257
 258 To understand positional replacement strings see B<{>I<n>B<}>.
 259
 260
 261 =item B<{=>I<perl expression>B<=}>
 262
 263 Replace with calculated I<perl expression>. B<$_> will contain the
 264 same as B<{}>. After evaluating I<perl expression> B<$_> will be used
 265 as the value. It is recommended to only change $_ but you have full
 266 access to all of GNU B<parallel>'s internal functions and data
 267 structures. A few convenience functions and data structures have been
 268 made:
 269
 270 =over 15
 271
 272 =item Z<> B<Q(>I<string>B<)>
 273
 274 shell quote a string
 275
 276 =item Z<> B<pQ(>I<string>B<)>
 277
 278 perl quote a string
 279
 280 =item Z<> B<total_jobs()>
 281
 282 number of jobs in total
 283
 284 =item Z<> B<slot()>
 285
 286 slot number of job
 287
 288 =item Z<> B<seq()>
 289
 290 sequence number of job
 291
 292 =item Z<> B<@arg>
 293
 294 the arguments
 295
 296 =back
 297
 298 Example:
 299
 300   seq 10 | parallel echo {} + 1 is {= '$_++' =}
 301   parallel csh -c {= '$_="mkdir ".Q($_)' =} ::: '12" dir'
 302   seq 50 | parallel echo job {#} of {= '$_=total_jobs()' =}
 303
 304 See also: B<--rpl> B<--parens>
 305
 306
 307 =item B<{=>I<n> I<perl expression>B<=}>
 308
 309 Positional equivalent to B<{=perl expression=}>. To understand
 310 positional replacement strings see B<{>I<n>B<}>.
 311
 312 See also: B<{=perl expression=}> B<{>I<n>B<}>.
 313
 314
 315 =item B<:::> I<arguments>
 316
 317 Use arguments from the command line as input source instead of stdin
 318 (standard input). Unlike other options for GNU B<parallel> B<:::> is
 319 placed after the I<command> and before the arguments.
 320
 321 The following are equivalent:
 322
 323   (echo file1; echo file2) | parallel gzip
 324   parallel gzip ::: file1 file2
 325   parallel gzip {} ::: file1 file2
 326   parallel --arg-sep ,, gzip {} ,, file1 file2
 327   parallel --arg-sep ,, gzip ,, file1 file2
 328   parallel ::: "gzip file1" "gzip file2"
 329
 330 To avoid treating B<:::> as special use B<--arg-sep> to set the
 331 argument separator to something else. See also B<--arg-sep>.
 332
 333 If multiple B<:::> are given, each group will be treated as an input
 334 source, and all combinations of input sources will be
 335 generated. E.g. ::: 1 2 ::: a b c will result in the combinations
 336 (1,a) (1,b) (1,c) (2,a) (2,b) (2,c). This is useful for replacing
 337 nested for-loops.
 338
 339 B<:::> and B<::::> can be mixed. So these are equivalent:
 340
 341   parallel echo {1} {2} {3} ::: 6 7 ::: 4 5 ::: 1 2 3
 342   parallel echo {1} {2} {3} :::: <(seq 6 7) <(seq 4 5) \
 343     :::: <(seq 1 3)
 344   parallel -a <(seq 6 7) echo {1} {2} {3} :::: <(seq 4 5) \
 345     :::: <(seq 1 3)
 346   parallel -a <(seq 6 7) -a <(seq 4 5) echo {1} {2} {3} \
 347     ::: 1 2 3
 348   seq 6 7 | parallel -a - -a <(seq 4 5) echo {1} {2} {3} \
 349     ::: 1 2 3
 350   seq 4 5 | parallel echo {1} {2} {3} :::: <(seq 6 7) - \
 351     ::: 1 2 3
 352
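The combinations generated by multiple B<:::> input sources are the
ones nested for-loops would produce. The sketch below spells this out
in plain shell for the B<::: 1 2 ::: a b c> case. The function name
B<combos> is hypothetical, and unlike GNU B<parallel> the loop runs
the iterations one after another.

```shell
# Sequential counterpart of:  parallel echo {1} {2} ::: 1 2 ::: a b c
# "combos" is a made-up name; it only lists the argument pairs.
combos() {
  for x in 1 2; do          # first input source
    for y in a b c; do      # second input source
      echo "$x $y"
    done
  done
}

combos
```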
 353
 354 =item B<:::+> I<arguments>
 355
 356 Like B<:::> but linked like B<--link> to the previous input source.
 357
 358 Contrary to B<--link>, values do not wrap: The shortest input source
 359 determines the length.
 360
 361 Example:
 362
 363   parallel echo ::: a b c :::+ 1 2 3 ::: X Y :::+ 11 22
 364
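Linking pairs values by position instead of combining them, and stops
at the end of the shortest source. A plain-shell sketch of that
pairing rule for two space-separated lists follows. B<link_lists> is a
hypothetical helper and not part of GNU B<parallel>; it assumes the
default IFS and words without glob characters.

```shell
# Roughly the pairing done by:  parallel echo ::: a b c :::+ 1 2 3
# "link_lists" is a made-up helper; $1 and $2 are space-separated lists.
link_lists() {
  left=$1
  right=$2
  for l in $left; do
    set -- $right              # remaining words of the right-hand list
    [ $# -gt 0 ] || break      # right list exhausted: stop, do not wrap
    echo "$l $1"
    shift
    right=$*                   # keep the unused words for the next round
  done
}

link_lists "a b c" "1 2 3"
link_lists "a b c" "1 2"       # the shorter list determines the length
```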
 365
 366 =item B<::::> I<argfiles>
 367
 368 Another way to write B<-a> I<argfile1> B<-a> I<argfile2> ...
 369
 370 B<:::> and B<::::> can be mixed.
 371
 372 See B<-a>, B<:::> and B<--link>.
 373
 374
 375 =item B<::::+> I<argfiles>
 376
 377 Like B<::::> but linked like B<--link> to the previous input source.
 378
 379 Contrary to B<--link>, values do not wrap: The shortest input source
 380 determines the length.
 381
 382
 383 =item B<--null>
 384
 385 =item B<-0>
 386
 387 Use NUL as delimiter.  Normally input lines will end in \n
 388 (newline). If they end in \0 (NUL), then use this option. It is useful
 389 for processing arguments that may contain \n (newline).
 390
 391
 392 =item B<--arg-file> I<input-file>
 393
 394 =item B<-a> I<input-file>
 395
 396 Use I<input-file> as input source. If you use this option, stdin
 397 (standard input) is given to the first process run.  Otherwise, stdin
 398 (standard input) is redirected from /dev/null.
 399
 400 If multiple B<-a> are given, each I<input-file> will be treated as an
 401 input source, and all combinations of input sources will be
 402 generated. E.g. The file B<foo> contains B<1 2>, the file B<bar>
 403 contains B<a b c>.  B<-a foo> B<-a bar> will result in the combinations
 404 (1,a) (1,b) (1,c) (2,a) (2,b) (2,c). This is useful for replacing
 405 nested for-loops.
 406
 407 See also B<--link> and B<{>I<n>B<}>.
 408
 409
 410 =item B<--arg-file-sep> I<sep-str>
 411
 412 Use I<sep-str> instead of B<::::> as separator string between command
 413 and argument files. Useful if B<::::> is used for something else by the
 414 command.
 415
 416 See also: B<::::>.
 417
 418
 419 =item B<--arg-sep> I<sep-str>
 420
 421 Use I<sep-str> instead of B<:::> as separator string. Useful if B<:::>
 422 is used for something else by the command.
 423
 424 Also useful if your command uses B<:::> but you still want to read
 425 arguments from stdin (standard input): Simply change B<--arg-sep> to a
 426 string that is not in the command line.
 427
 428 See also: B<:::>.
 429
 430
 431 =item B<--bar>
 432
 433 Show progress as a progress bar. The bar shows: % of jobs
 434 completed, estimated seconds left, and number of jobs started.
 435
 436 It is compatible with B<zenity>:
 437
 438   seq 1000 | parallel -j30 --bar '(echo {};sleep 0.1)' \
 439     2> >(zenity --progress --auto-kill) | wc
 440
 441
 442 =item B<--basefile> I<file>
 443
 444 =item B<--bf> I<file>
 445
 446 I<file> will be transferred to each sshlogin before a job is
 447 started. It will be removed if B<--cleanup> is active. The file may be
 448 a script to run or some common base data needed for the job.
 449 Multiple B<--bf> can be specified to transfer more basefiles. The
 450 I<file> will be transferred the same way as B<--transferfile>.
 451
 452
 453 =item B<--basenamereplace> I<replace-str>
 454
 455 =item B<--bnr> I<replace-str>
 456
 457 Use the replacement string I<replace-str> instead of B<{/}> for
 458 basename of input line.
 459
 460
 461 =item B<--basenameextensionreplace> I<replace-str>
 462
 463 =item B<--bner> I<replace-str>
 464
 465 Use the replacement string I<replace-str> instead of B<{/.}> for basename of input line without extension.
 466
 467
 468 =item B<--bg>
 469
 470 Run command in the background, so GNU B<parallel> will not wait for
 471 completion of the command before exiting. This is the default if
 472 B<--semaphore> is set.
 473
 474 See also: B<--fg>, B<man sem>.
 475
 476 Implies B<--semaphore>.
 477
 478
 479 =item B<--bibtex>
 480
 481 =item B<--citation>
 482
 483 Print the citation notice and BibTeX entry for GNU B<parallel>,
 484 silence citation notice for all future runs, and exit. It will not run
 485 any commands.
 486
 487 If it is impossible for you to run B<--citation> you can instead use
 488 B<--will-cite>, which will run commands, but which will only silence
 489 the citation notice for this single run.
 490
 491 If you use B<--will-cite> in scripts to be run by others you are
 492 making it harder for others to see the citation notice.  The
 493 development of GNU B<parallel> is indirectly financed through
 494 citations, so if your users do not know they should cite then you are
 495 making it harder to finance development. However, if you pay 10000
 496 EUR, you have done your part to finance future development and should
 497 feel free to use B<--will-cite> in scripts.
 498
 499 If you do not want to help finance future development by letting
 500 other users see the citation notice or by paying, then please use
 501 another tool instead of GNU B<parallel>. You can find some of the
 502 alternatives in B<man parallel_alternatives>.
 503
 504
 505 =item B<--block> I<size>
 506
 507 =item B<--block-size> I<size>
 508
 509 Size of block in bytes to read at a time. The I<size> can be postfixed
 510 with K, M, G, T, P, E, k, m, g, t, p, or e which would multiply the
 511 size with 1024, 1048576, 1073741824, 1099511627776, 1125899906842624,
 512 1152921504606846976, 1000, 1000000, 1000000000, 1000000000000,
 513 1000000000000000, or 1000000000000000000 respectively.
 514
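The suffix arithmetic can be checked with a small shell function. This
is a sketch under the assumption that I<size> is an integer followed
by at most one of the suffixes listed above; B<block_bytes> is a
hypothetical helper, not an option or function of GNU B<parallel>.

```shell
# Convert a --block style size such as 1M or 10k to bytes.
# "block_bytes" is a made-up helper; only integer sizes are handled.
block_bytes() {
  num=${1%[KMGTPEkmgtpe]}     # numeric part, with any suffix removed
  suffix=${1#"$num"}          # the suffix itself, or empty
  case "$suffix" in
    K) mult=1024 ;;
    M) mult=1048576 ;;
    G) mult=1073741824 ;;
    T) mult=1099511627776 ;;
    P) mult=1125899906842624 ;;
    E) mult=1152921504606846976 ;;
    k) mult=1000 ;;
    m) mult=1000000 ;;
    g) mult=1000000000 ;;
    t) mult=1000000000000 ;;
    p) mult=1000000000000000 ;;
    e) mult=1000000000000000000 ;;
    *) mult=1 ;;              # no suffix: plain bytes
  esac
  echo $((num * mult))
}

block_bytes 1M     # the default block size, in bytes
block_bytes 10k
```

Uppercase suffixes are powers of 1024 and lowercase suffixes are
powers of 1000, so B<1M> and B<1m> differ by 48576 bytes.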
 515 GNU B<parallel> tries to meet the block size but can be off by the
 516 length of one record. For performance reasons I<size> should be bigger
 517 than two records. GNU B<parallel> will warn you and automatically
 518 increase the size if you choose a I<size> that is too small.
 519
 520 If you use B<-N>, B<--block-size> should be bigger than N+1 records.
 521
 522 I<size> defaults to 1M.
 523
 524 When using B<--pipepart> a negative block size is not interpreted as a
 525 blocksize but as the number of blocks each jobslot should have. So
 526 this will run 10*5 = 50 jobs in total:
 527
 528   parallel --pipepart -a myfile --block -10 -j5 wc
 529
 530 This is an efficient alternative to B<--roundrobin> because data is
 531 never read by GNU B<parallel>, but you can still have very few
 532 jobslots process a large amount of data.
 533
 534 See B<--pipe> and B<--pipepart> for use of this.
 535
 536
 537 =item B<--cat>
 538
 539 Create a temporary file with content. Normally B<--pipe>/B<--pipepart>
 540 will give data to the program on stdin (standard input). With B<--cat>
 541 GNU B<parallel> will create a temporary file with the name in B<{}>, so
 542 you can do: B<parallel --pipe --cat wc {}>.
 543
 544 Implies B<--pipe> unless B<--pipepart> is used.
 545
 546 See also B<--fifo>.
 547
 548
 549 =item B<--cleanup>
 550
 551 Remove transferred files. B<--cleanup> will remove the transferred
 552 files on the remote computer after processing is done.
 553
 554   find log -name '*gz' | parallel \
 555     --sshlogin server.example.com --transferfile {} \
 556     --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"
 557
 558 With B<--transferfile {}> the file transferred to the remote computer
 559 will be removed on the remote computer.  Directories created will not
 560 be removed - even if they are empty.
 561
 562 With B<--return> the file transferred from the remote computer will be
 563 removed on the remote computer.  Directories created will not be
 564 removed - even if they are empty.
 565
 566 B<--cleanup> is ignored when not used with B<--transferfile> or
 567 B<--return>.
 568
 569
 570 =item B<--colsep> I<regexp>
 571
 572 =item B<-C> I<regexp>
 573
 574 Column separator. The input will be treated as a table with I<regexp>
 575 separating the columns. The n'th column can be accessed using
 576 B<{>I<n>B<}> or B<{>I<n>.B<}>. E.g. B<{3}> is the 3rd column.
 577
 578 If there are multiple input sources, each input source will be separated,
 579 but the columns from each input source will be linked (see B<--link>).
 580
 581   parallel --colsep '-' echo {4} {3} {2} {1} \
 582     ::: A-B C-D ::: e-f g-h
 583
 584 B<--colsep> implies B<--trim rl>, which can be overridden with
 585 B<--trim n>.
 586
 587 I<regexp> is a Perl Regular Expression:
 588 http://perldoc.perl.org/perlre.html
 589
 590
 591 =item B<--compress>
 592
 593 Compress temporary files. If the output is big and very compressible
 594 this will take up less disk space in $TMPDIR and possibly be faster
 595 due to less disk I/O.
 596
 597 GNU B<parallel> will try B<pzstd>, B<lbzip2>, B<pbzip2>, B<zstd>,
 598 B<pigz>, B<lz4>, B<lzop>, B<plzip>, B<lzip>, B<lrz>, B<gzip>, B<pxz>,
 599 B<lzma>, B<bzip2>, B<xz>, B<clzip>, in that order, and use the first
 600 available.
 601
 602
 603 =item B<--compress-program> I<prg>
 604
 605 =item B<--decompress-program> I<prg>
 606
 607 Use I<prg> for (de)compressing temporary files. It is assumed that I<prg
 608 -dc> will decompress stdin (standard input) to stdout (standard
 609 output) unless B<--decompress-program> is given.
 610
 611
 612 =item B<--csv>
 613
 614 Treat input as CSV-format. B<--colsep> sets the field delimiter. It
 615 works very much like B<--colsep> except it deals correctly with
 616 quoting:
 617
 618    echo '"1 big, 2 small","2""x4"" plank",12.34' |
 619      parallel --csv echo {1} of {2} at {3}
 620
 621 Even quoted newlines are parsed correctly:
 622
 623    (echo '"Start of field 1 with newline'
 624     echo 'Line 2 in field 1";value 2') |
 625      parallel --csv --colsep ';' echo Field 1: {1} Field 2: {2}
 626
 627 When used with B<--pipe> only pass full CSV-records.
 628
 629
 630 =item B<--delimiter> I<delim>
 631
 632 =item B<-d> I<delim>
 633
 634 Input items are terminated by I<delim>.  Quotes and backslash are not
 635 special; every character in the input is taken literally.  Disables
 636 the end-of-file string, which is treated like any other argument. The
 637 specified delimiter may be characters, C-style character escapes such
 638 as \n, or octal or hexadecimal escape codes.  Octal and hexadecimal
 639 escape codes are understood as for the printf command.  Multibyte
 640 characters are not supported.
 641
 642
 643 =item B<--dirnamereplace> I<replace-str>
 644
 645 =item B<--dnr> I<replace-str>
 646
 647 Use the replacement string I<replace-str> instead of B<{//}> for
 648 dirname of input line.
 649
 650
 651 =item B<-E> I<eof-str>
 652
 653 Set the end of file string to I<eof-str>.  If the end of file string
 654 occurs as a line of input, the rest of the input is not read.  If
 655 neither B<-E> nor B<-e> is used, no end of file string is used.
 656
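The effect of an end of file string on the input can be imitated with
a short filter. This is a plain-shell sketch of the rule above, not
the code GNU B<parallel> uses; B<until_eof_str> is a hypothetical
name.

```shell
# Pass input lines through until a line equal to the end of file
# string is seen; that line and the rest of the input are dropped.
# "until_eof_str" is a made-up helper name.
until_eof_str() {
  # Appending "" forces a string comparison rather than a numeric one.
  awk -v eof="$1" '($0 "") == (eof "") { exit } { print }'
}

printf '%s\n' a b STOP c d | until_eof_str STOP
```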
 657
 658 =item B<--delay> I<mytime>
 659
 660 Delay starting next job by I<mytime>. GNU B<parallel> will pause
 661 I<mytime> after starting each job. I<mytime> is normally in seconds,
 662 but can be floats postfixed with B<s>, B<m>, B<h>, or B<d> which would
 663 multiply the float by 1, 60, 3600, or 86400. Thus these are
 664 equivalent: B<--delay 100000> and B<--delay 1d3.5h16.6m4s>.
 665
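The equivalence of the two B<--delay> values can be verified by
summing the parts: 1*86400 + 3.5*3600 + 16.6*60 + 4 = 100000. The
function below is a sketch of that conversion using B<awk>.
B<delay_seconds> is a hypothetical helper, and it rounds the result to
whole seconds, which GNU B<parallel> itself need not do.

```shell
# Convert a time such as 1d3.5h16.6m4s to seconds.
# "delay_seconds" is a made-up helper; the result is rounded to an integer.
delay_seconds() {
  printf '%s\n' "$1" | awk '{
    rest = $0; total = 0
    # Consume one "<number><unit>" token at a time from the left.
    while (match(rest, /^[0-9.]+[smhd]?/)) {
      tok  = substr(rest, 1, RLENGTH)
      rest = substr(rest, RLENGTH + 1)
      num  = tok
      mult = 1                                  # no suffix or "s": seconds
      if (sub(/m$/, "", num)) mult = 60         # minutes
      else if (sub(/h$/, "", num)) mult = 3600  # hours
      else if (sub(/d$/, "", num)) mult = 86400 # days
      else sub(/s$/, "", num)
      total += num * mult
    }
    printf "%.0f\n", total
  }'
}

delay_seconds 1d3.5h16.6m4s   # prints 100000
delay_seconds 100000          # prints 100000
```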
 666
 667 =item B<--dry-run>
 668
 669 Print the job to run on stdout (standard output), but do not run the
 670 job. Use B<-v -v> to include the wrapping that GNU B<parallel>
 671 generates (for remote jobs, B<--tmux>, B<--nice>, B<--pipe>,
 672 B<--pipepart>, B<--fifo> and B<--cat>). Do not count on this
 673 literally, though, as the job may be scheduled on another computer or
 674 the local computer if : is in the list.
 675
 676
 677 =item B<--eof>[=I<eof-str>]
 678
 679 =item B<-e>[I<eof-str>]
 680
 681 This option is a synonym for the B<-E> option.  Use B<-E> instead,
 682 because it is POSIX compliant for B<xargs> while this option is not.
 683 If I<eof-str> is omitted, there is no end of file string.  If neither
 684 B<-E> nor B<-e> is used, no end of file string is used.
 685
 686
 687 =item B<--embed>
 688
 689 Embed GNU B<parallel> in a shell script. If you need to distribute your
 690 script to someone who does not want to install GNU B<parallel> you can
 691 embed GNU B<parallel> in your own shell script:
 692
 693   parallel --embed > new_script
 694
 695 Then add your code at the end of B<new_script>. This is tested
 696 on B<ash>, B<bash>, B<dash>, B<ksh>, B<sh>, and B<zsh>.
 697
 698
 699 =item B<--env> I<var>
 700
 701 Copy environment variable I<var>. This will copy I<var> to the
 702 environment that the command is run in. This is especially useful for
 703 remote execution.
 704
 705 In Bash I<var> can also be a Bash function - just remember to B<export
 706 -f> the function, see B<command>.
 707
 708 The variable '_' is special. It will copy all exported environment
 709 variables except for the ones mentioned in ~/.parallel/ignored_vars.
 710
 711 To copy the full environment (both exported and not exported
 712 variables, arrays, and functions) use B<env_parallel>.
 713
 714 See also: B<--record-env>, B<--session>.
 715
 716
 717 =item B<--eta>
 718
 719 Show the estimated number of seconds before finishing. This forces GNU
 720 B<parallel> to read all jobs before starting to find the number of
 721 jobs. GNU B<parallel> normally only reads the next job to run.
 722
 723 The estimate is based on the runtime of finished jobs, so the first
 724 estimate will only be shown when the first job has finished.
 725
 726 Implies B<--progress>.
 727
 728 See also: B<--bar>, B<--progress>.
 729
 730
 731 =item B<--fg>
 732
 733 Run command in foreground.
 734
 735 With B<--tmux> and B<--tmuxpane> GNU B<parallel> will start B<tmux> in
 736 the foreground.
 737
 738 With B<--semaphore> GNU B<parallel> will run the command in the
 739 foreground (opposite B<--bg>), and wait for completion of the command
 740 before exiting.
 741
 742
 743 See also B<--bg>, B<man sem>.
 744
 745
 746 =item B<--fifo>
 747
 748 Create a temporary fifo with content. Normally B<--pipe> and
 749 B<--pipepart> will give data to the program on stdin (standard
 750 input). With B<--fifo> GNU B<parallel> will create a temporary fifo
 751 with the name in B<{}>, so you can do: B<parallel --pipe --fifo wc {}>.
 752
 753 Beware: If data is not read from the fifo, the job will block forever.
 754
 755 Implies B<--pipe> unless B<--pipepart> is used.
 756
 757 See also B<--cat>.
 758
 759
 760 =item B<--filter-hosts>
 761
 762 Remove down hosts. For each remote host: check that login through ssh
 763 works. If not: do not use this host.
 764
 765 For performance reasons, this check is performed only at the start and
 766 every time B<--sshloginfile> is changed. If an host goes down after
 767 the first check, it will go undetected until B<--sshloginfile> is
 768 changed; B<--retries> can be used to mitigate this.
 769
 770 Currently you can I<not> put B<--filter-hosts> in a profile,
 771 $PARALLEL, /etc/parallel/config or similar. This is because GNU
 772 B<parallel> uses GNU B<parallel> to compute this, so you will get an
 773 infinite loop. This will likely be fixed in a later release.
 774
 775
 776 =item B<--gnu>
 777
 778 Behave like GNU B<parallel>. This option historically took precedence
 779 over B<--tollef>. The B<--tollef> option is now retired, and therefore
 780 may not be used. B<--gnu> is kept for compatibility.
 781
 782
 783 =item B<--group>
 784
 785 Group output. Output from each job is grouped together and is only
 786 printed when the command is finished. Stdout (standard output) first
 787 followed by stderr (standard error).
 788
 789 This takes on the order of 0.5ms per job and depends on the speed of
 790 your disk for larger output. It can be disabled with B<-u>, but this
 791 means output from different commands can get mixed.
 792
 793 B<--group> is the default. Can be reversed with B<-u>.
 794
 795 See also: B<--line-buffer> B<--ungroup>
 796
 797
 798 =item B<--help>
 799
 800 =item B<-h>
 801
 802 Print a summary of the options to GNU B<parallel> and exit.
 803
 804
 805 =item B<--halt-on-error> I<val>
 806
 807 =item B<--halt> I<val>
 808
 809 When should GNU B<parallel> terminate? In some situations it makes no
 810 sense to run all jobs. GNU B<parallel> should simply give up as soon
 811 as a condition is met.
 812
 813 I<val> defaults to B<never>, which runs all jobs no matter what.
 814
 815 I<val> can also take on the form of I<when>,I<why>.
 816
 817 I<when> can be 'now' which means kill all running jobs and halt
 818 immediately, or it can be 'soon' which means wait for all running jobs
 819 to complete, but start no new jobs.
 820
 821 I<why> can be 'fail=X', 'fail=Y%', 'success=X', 'success=Y%',
 822 'done=X', or 'done=Y%' where X is the number of jobs that have to fail,
 823 succeed, or be done before halting, and Y is the percentage of jobs
 824 that have to fail, succeed, or be done before halting.
 825
 826 Example:
 827
 828 =over 23
 829
 830 =item Z<> --halt now,fail=1
 831
 832 exit when the first job fails. Kill running jobs.
 833
 834 =item Z<> --halt soon,fail=3
 835
 836 exit when 3 jobs fail, but wait for running jobs to complete.
 837
 838 =item Z<> --halt soon,fail=3%
 839
 840 exit when 3% of the jobs have failed, but wait for running jobs to complete.
 841
 842 =item Z<> --halt now,success=1
 843
 844 exit when a job succeeds. Kill running jobs.
 845
 846 =item Z<> --halt soon,success=3
 847
 848 exit when 3 jobs succeed, but wait for running jobs to complete.
 849
 850 =item Z<> --halt now,success=3%
 851
 852 exit when 3% of the jobs have succeeded. Kill running jobs.
 853
 854 =item Z<> --halt now,done=1
 855
 856 exit when one of the jobs finishes. Kill running jobs.
 857
 858 =item Z<> --halt soon,done=3
 859
 860 exit when 3 jobs finish, but wait for running jobs to complete.
 861
 862 =item Z<> --halt now,done=3%
 863
 864 exit when 3% of the jobs have finished. Kill running jobs.
 865
 866 =back
 867
 868 For backwards compatibility these also work:
 869
 870 =over 12
 871
 872 =item Z<>0
 873
 874 never
 875
 876 =item Z<>1
 877
 878 soon,fail=1
 879
 880 =item Z<>2
 881
 882 now,fail=1
 883
 884 =item Z<>-1
 885
 886 soon,success=1
 887
 888 =item Z<>-2
 889
 890 now,success=1
 891
 892 =item Z<>1-99%
 893
 894 soon,fail=1-99%
 895
 896 =back
 897
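The backwards compatible values map one-to-one onto the
I<when>,I<why> form. The table above can be written as a small lookup
function. B<halt_modern> is a hypothetical helper for illustration
only; anything not in the table is passed through unchanged.

```shell
# Map a legacy --halt value to the when,why form from the table above.
# "halt_modern" is a made-up helper, not part of GNU parallel.
halt_modern() {
  case "$1" in
    0)                  echo never ;;
    1)                  echo soon,fail=1 ;;
    2)                  echo now,fail=1 ;;
    -1)                 echo soon,success=1 ;;
    -2)                 echo now,success=1 ;;
    [1-9]%|[1-9][0-9]%) echo "soon,fail=$1" ;;   # 1-99%
    *)                  echo "$1" ;;             # not a legacy value
  esac
}

halt_modern 2
halt_modern 30%
```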
 898
 899 =item B<--header> I<regexp>
 900
 901 Use regexp as header. For normal usage the matched header (typically
 902 the first line: B<--header '.*\n'>) will be split using B<--colsep>
 903 (which will default to '\t') and column names can be used as
 904 replacement variables: B<{column name}>, B<{column name/}>, B<{column
 905 name//}>, B<{column name/.}>, B<{column name.}>, B<{=column name perl
 906 expression =}>, ..
 907
 908 For B<--pipe> the matched header will be prepended to each output.
 909
 910 B<--header :> is an alias for B<--header '.*\n'>.
 911
 912 If I<regexp> is a number, it is a fixed number of lines.
 913
 914
 915 =item B<--hostgroups>
 916
 917 =item B<--hgrp>
 918
 919 Enable hostgroups on arguments. If an argument contains '@' the string
 920 after '@' will be removed and treated as a list of hostgroups on which
 921 this job is allowed to run. If there is no B<--sshlogin> with a
 922 corresponding group, the job will run on any hostgroup.
 923
 924 Example:
 925
 926   parallel --hostgroups \
 927     --sshlogin @grp1/myserver1 -S @grp1+grp2/myserver2 \
 928     --sshlogin @grp3/myserver3 \
 929     echo ::: my_grp1_arg@grp1 arg_for_grp2@grp2 third@grp1+grp3
 930
 931 B<my_grp1_arg> may be run on either B<myserver1> or B<myserver2>,
 932 B<third> may be run on either B<myserver1> or B<myserver3>,
 933 but B<arg_for_grp2> will only be run on B<myserver2>.
 934
 935 See also: B<--sshlogin>.
 936
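How an argument is split into its value and its hostgroups can be sketched in plain shell. The helper names B<arg_value> and B<arg_groups> are hypothetical, and splitting at the last '@' is an assumption; the text above only says that the string after '@' is removed:

```shell
# Hypothetical helpers: split "value@grp1+grp2" into its parts.
# Assumption: the split happens at the last '@' in the argument.
arg_value() {
  printf '%s\n' "${1%@*}"
}
arg_groups() {
  case "$1" in
    *@*) printf '%s\n' "${1##*@}" | tr '+' ' ' ;;
    *)   printf '\n' ;;   # no '@': no hostgroup restriction
  esac
}

arg_value  third@grp1+grp3   # prints: third
arg_groups third@grp1+grp3   # prints: grp1 grp3
```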
 937
 938 =item B<-I> I<replace-str>
 939
 940 Use the replacement string I<replace-str> instead of B<{}>.
 941
 942
 943 =item B<--replace>[=I<replace-str>]
 944
 945 =item B<-i>[I<replace-str>]
 946
 947 This option is a synonym for B<-I>I<replace-str> if I<replace-str> is
 948 specified, and for B<-I {}> otherwise.  This option is deprecated;
 949 use B<-I> instead.
 950
 951
 952 =item B<--joblog> I<logfile>
 953
 954 Logfile for executed jobs. Save a list of the executed jobs to
 955 I<logfile> in the following TAB separated format: sequence number,
 956 sshlogin, start time as seconds since epoch, run time in seconds,
 957 bytes in files transferred, bytes in files returned, exit status,
 958 signal, and command run.
 959
 960 For B<--pipe> bytes transferred and bytes returned are the number of
 961 input and output bytes.
 962
 963 If I<logfile> is prefixed with '+', log lines will be appended to the
 964 logfile.
 965
 966 To convert the times into ISO-8601 strict do:
 967
 968   cat logfile | perl -a -F"\t" -ne \
 969     'chomp($F[2]=`date -d \@$F[2] +%FT%T`); print join("\t",@F)'
 970
 971 If the host is long, you can use B<column -t> to pretty print it:
 972
 973   cat joblog | column -t
 974
 975 See also B<--resume> B<--resume-failed>.
 976
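Because the joblog is TAB separated with the exit status in column 7 and the signal in column 8, failed jobs can be picked out with standard tools. The following is a minimal sketch: B<failed_jobs> is a hypothetical helper, the sample rows are made up, and the header names are illustrative. It assumes the first line of the joblog is a header line:

```shell
# List sequence number and command of every job in a joblog that
# exited non-zero or was killed by a signal.
failed_jobs() {
  awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $1 "\t" $9 }' "$1"
}

# Made-up sample joblog; the first line is assumed to be a header.
log=$(mktemp)
{
  printf 'Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n'
  printf '1\t:\t1700000000.000\t1.002\t0\t0\t0\t0\tsleep 1; exit 0\n'
  printf '2\t:\t1700000000.100\t2.003\t0\t0\t1\t0\tsleep 2; exit 1\n'
} > "$log"

failed_jobs "$log"
```

With the sample above only job 2 is listed.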
 977
 978 =item B<--jobs> I<N>
 979
 980 =item B<-j> I<N>
 981
 982 =item B<--max-procs> I<N>
 983
 984 =item B<-P> I<N>
 985
 986 Number of jobslots on each machine. Run up to N jobs in parallel.  0
 987 means as many as possible. Default is 100% which will run one job per
 988 CPU on each machine.
 989
 990 If B<--semaphore> is set, the default is 1 thus making a mutex.
 991
 992
 993 =item B<--jobs> I<+N>
 994
 995 =item B<-j> I<+N>
 996
 997 =item B<--max-procs> I<+N>
 998
 999 =item B<-P> I<+N>
1000
1001 Add N to the number of CPUs.  Run this many jobs in parallel.  See
1002 also B<--use-cores-instead-of-threads> and
1003 B<--use-sockets-instead-of-threads>.
1004
1005
1006 =item B<--jobs> I<-N>
1007
1008 =item B<-j> I<-N>
1009
1010 =item B<--max-procs> I<-N>
1011
1012 =item B<-P> I<-N>
1013
1014 Subtract N from the number of CPUs.  Run this many jobs in parallel.
1015 If the evaluated number is less than 1 then 1 will be used.  See also
1016 B<--use-cores-instead-of-threads> and
1017 B<--use-sockets-instead-of-threads>.
1018
1019
1020 =item B<--jobs> I<N>%
1021
1022 =item B<-j> I<N>%
1023
1024 =item B<--max-procs> I<N>%
1025
1026 =item B<-P> I<N>%
1027
1028 Multiply N% with the number of CPUs.  Run this many jobs in
1029 parallel. See also B<--use-cores-instead-of-threads> and
1030 B<--use-sockets-instead-of-threads>.
1031
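The four forms of I<N> described above can be summarized in a small shell sketch. B<jobslots> is a hypothetical helper that takes the value and a CPU count. Rounding I<N>% down and using at least 1 for the I<+N> and I<N>% forms are assumptions made for this sketch; only the lower bound for I<-N> is stated above:

```shell
# Sketch: evaluate a --jobs style value for a given number of CPUs.
# Assumptions: N% is rounded down, and every computed result is
# clamped to at least 1 (the text only states this for -N).
jobslots() {
  spec=$1
  ncpu=$2
  case "$spec" in
    +*) add=${spec#+};  n=$(( ncpu + add )) ;;
    -*) sub=${spec#-};  n=$(( ncpu - sub )) ;;
    *%) pct=${spec%\%}; n=$(( ncpu * pct / 100 )) ;;
    *)  echo "$spec"; return 0 ;;   # plain N (0 = as many as possible)
  esac
  if [ "$n" -lt 1 ]; then n=1; fi
  echo "$n"
}

jobslots +2 8     # prints: 10
jobslots -10 8    # prints: 1
jobslots 50% 8    # prints: 4
```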
1032
1033 =item B<--jobs> I<procfile>
1034
1035 =item B<-j> I<procfile>
1036
1037 =item B<--max-procs> I<procfile>
1038
1039 =item B<-P> I<procfile>
1040
1041 Read parameter from file. Use the content of I<procfile> as parameter
1042 for I<-j>. E.g. I<procfile> could contain the string 100% or +2 or
1043 10. If I<procfile> is changed when a job completes, I<procfile> is
1044 read again and the new number of jobs is computed. If the number is
1045 lower than before, running jobs will be allowed to finish but new jobs
1046 will not be started until the wanted number of jobs has been reached.
1047 This makes it possible to change the number of simultaneous running
1048 jobs while GNU B<parallel> is running.
1049
1050
1051 =item B<--keep-order>
1052
1053 =item B<-k>
1054
1055 Keep sequence of output same as the order of input. Normally the
1056 output of a job will be printed as soon as the job completes. Try this
1057 to see the difference:
1058
1059   parallel -j4 sleep {}\; echo {} ::: 2 1 4 3
1060   parallel -j4 -k sleep {}\; echo {} ::: 2 1 4 3
1061
1062 If used with B<--onall> or B<--nonall> the output will be grouped by
1063 sshlogin in sorted order.
1064
1065 If used with B<--pipe --roundrobin> and the same input, the jobslots
1066 will get the same blocks in the same order in every run.
1067
1068 B<-k> only affects the order in which the output is printed - not the
1069 order in which jobs are run.
1070
1071
1072 =item B<-L> I<recsize>
1073
1074 When used with B<--pipe>: Read records of I<recsize> lines.
1075
1076 When used otherwise: Use at most I<recsize> nonblank input lines per
1077 command line.  Trailing blanks cause an input line to be logically
1078 continued on the next input line.
1079
1080 B<-L 0> means read one line, but insert 0 arguments on the command
1081 line.
1082
1083 Implies B<-X> unless B<-m>, B<--xargs>, or B<--pipe> is set.
1084
1085
1086 =item B<--max-lines>[=I<recsize>]
1087
1088 =item B<-l>[I<recsize>]
1089
1090 When used with B<--pipe>: Read records of I<recsize> lines.
1091
1092 When used otherwise: Synonym for the B<-L> option.  Unlike B<-L>, the
1093 I<recsize> argument is optional.  If I<recsize> is not specified,
1094 it defaults to one.  The B<-l> option is deprecated since the POSIX
1095 standard specifies B<-L> instead.
1096
1097 B<-l 0> is an alias for B<-l 1>.
1098
1099 Implies B<-X> unless B<-m>, B<--xargs>, or B<--pipe> is set.
1100
1101
1102 =item B<--limit> "I<command> I<args>"
1103
1104 Dynamic job limit. Before starting a new job run I<command> with
1105 I<args>. The exit value of I<command> determines what GNU B<parallel>
1106 will do:
1107
1108 =over 4
1109
1110 =item Z<>0
1111
1112 Below limit. Start another job.
1113
1114 =item Z<>1
1115
1116 Over limit. Start no jobs.
1117
1118 =item Z<>2
1119
1120 Way over limit. Kill the youngest job.
1121
1122 =back
1123
1124 You can use any shell command. There are 3 predefined commands:
1125
1126 =over 10
1127
1128 =item "io I<n>"
1129
1130 Limit for I/O. The amount of disk I/O will be computed as a value
1131 0-100, where 0 is no I/O and 100 means that at least one disk is 100%
1132 saturated.
1133
1134 =item "load I<n>"
1135
1136 Similar to B<--load>.
1137
1138 =item "mem I<n>"
1139
1140 Similar to B<--memfree>.
1141
1142 =back
1143
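A custom B<--limit> command only has to follow the exit value convention above. As a minimal sketch, the hypothetical helper B<limit_status> maps a measured value and two thresholds onto the three exit values. What is measured and the thresholds 80 and 95 are illustrative assumptions:

```shell
# Hypothetical helper for a custom --limit command: compare a measured
# value against a soft and a hard threshold and return 0, 1 or 2.
limit_status() {
  value=$1
  soft=$2
  hard=$3
  if [ "$value" -ge "$hard" ]; then
    return 2    # way over limit: the youngest job would be killed
  elif [ "$value" -ge "$soft" ]; then
    return 1    # over limit: start no jobs
  fi
  return 0      # below limit: another job may start
}

# Example: treat 80 as the soft and 95 as the hard threshold.
limit_status 42 80 95 && echo "below limit"
```

A wrapper script could measure some resource, end with a call like the one above, and then be given to B<--limit>.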
1144
1145 =item B<--line-buffer>
1146
1147 =item B<--lb>
1148
1149 Buffer output on line basis. B<--group> will keep the output together
1150 for a whole job. B<--ungroup> allows output to mix up, with half a line
1151 coming from one job and half a line coming from another
1152 job. B<--line-buffer> fits between these two: GNU B<parallel> will
1153 print a full line, but will allow for mixing lines of different jobs.
1154
1155 B<--line-buffer> takes more CPU power than both B<--group> and
1156 B<--ungroup>, but can be much faster than B<--group> if the CPU is not
1157 the limiting factor.
1158
1159 Normally B<--line-buffer> does not buffer on disk, and can thus
1160 process an infinite amount of data, but it will buffer on disk when
1161 combined with: B<--keep-order>, B<--results>, B<--compress>, and
1162 B<--files>. This will make it as slow as B<--group> and will limit
1163 output to the available disk space.
1164
1165 With B<--keep-order> B<--line-buffer> will output lines from the first
1166 job while it is running, then lines from the second job while that is
1167 running. It will buffer full lines, but jobs will not mix. Compare:
1168
1169   parallel -j0 'echo {};sleep {};echo {}' ::: 1 3 2 4
1170   parallel -j0 --lb 'echo {};sleep {};echo {}' ::: 1 3 2 4
1171   parallel -j0 -k --lb 'echo {};sleep {};echo {}' ::: 1 3 2 4
1172
1173 See also: B<--group> B<--ungroup>
1174
1175
1176 =item B<--xapply>
1177
1178 =item B<--link>
1179
1180 Link input sources. Read multiple input sources like B<xapply>. If
1181 multiple input sources are given, one argument will be read from each
1182 of the input sources. The arguments can be accessed in the command as
1183 B<{1}> .. B<{>I<n>B<}>, so B<{1}> will be a line from the first input
1184 source, and B<{6}> will refer to the line with the same line number
1185 from the 6th input source.
1186
1187 Compare these two:
1188
1189   parallel echo {1} {2} ::: 1 2 3 ::: a b c
1190   parallel --link echo {1} {2} ::: 1 2 3 ::: a b c
1191
1192 Arguments will be recycled if one input source has more arguments than the others:
1193
1194   parallel --link echo {1} {2} {3} \
1195     ::: 1 2 ::: I II III ::: a b c d e f g
1196
1197 See also B<--header>, B<:::+>, B<::::+>.
1198
1199
1200 =item B<--load> I<max-load>
1201
1202 Do not start new jobs on a given computer unless the number of running
1203 processes on the computer is less than I<max-load>. I<max-load> uses
1204 the same syntax as B<--jobs>, so I<100%> for one per CPU is a valid
1205 setting. The only difference is 0, which is interpreted as 0.01.
1206
1207
1208 =item B<--controlmaster>
1209
1210 =item B<-M>
1211
1212 Use ssh's ControlMaster to make ssh connections faster. Useful if jobs
1213 run remotely and are very fast to run. This is disabled for sshlogins
1214 that specify their own ssh command.
1215
1216
1217 =item B<--xargs>
1218
1219 Multiple arguments. Insert as many arguments as the command line
1220 length permits.
1221
1222 If B<{}> is not used the arguments will be appended to the
1223 line.  If B<{}> is used multiple times each B<{}> will be replaced
1224 with all the arguments.
1225
1226 Support for B<--xargs> with B<--sshlogin> is limited and may fail.
1227
1228 See also B<-X> for context replace. If in doubt use B<-X> as that will
1229 most likely do what is needed.
1230
1231
1232 =item B<-m>
1233
1234 Multiple arguments. Insert as many arguments as the command line
1235 length permits. If multiple jobs are being run in parallel: distribute
1236 the arguments evenly among the jobs. Use B<-j1> or B<--xargs> to avoid this.
1237
1238 If B<{}> is not used the arguments will be appended to the
1239 line.  If B<{}> is used multiple times each B<{}> will be replaced
1240 with all the arguments.
1241
1242 Support for B<-m> with B<--sshlogin> is limited and may fail.
1243
1244 See also B<-X> for context replace. If in doubt use B<-X> as that will
1245 most likely do what is needed.
1246
1247
1248 =item B<--memfree> I<size>
1249
1250 Minimum memory free when starting another job. The I<size> can be
1251 postfixed with K, M, G, T, P, k, m, g, t, or p which would multiply
1252 the size with 1024, 1048576, 1073741824, 1099511627776,
1253 1125899906842624, 1000, 1000000, 1000000000, 1000000000000, or
1254 1000000000000000, respectively.
1255
1256 If the jobs take up very different amounts of RAM, GNU B<parallel> will
1257 only start as many as there is memory for. If less than I<size> bytes
1258 are free, no more jobs will be started. If less than 50% of I<size> bytes
1259 are free, the youngest job will be killed, and put back on the queue
1260 to be run later.
1261
1262 B<--retries> must be set to determine how many times GNU B<parallel>
1263 should retry a given job.
1264
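The suffix multipliers listed above can be illustrated with a small shell sketch. B<size_to_bytes> is a hypothetical helper; it only handles whole numbers and does not claim to reproduce how GNU B<parallel> parses I<size> internally:

```shell
# Sketch: expand a size such as 2G or 500k into bytes using the
# multipliers listed above. Only whole numbers are handled here.
size_to_bytes() {
  case "$1" in
    *K) mult=1024 ;;
    *M) mult=1048576 ;;
    *G) mult=1073741824 ;;
    *T) mult=1099511627776 ;;
    *P) mult=1125899906842624 ;;
    *k) mult=1000 ;;
    *m) mult=1000000 ;;
    *g) mult=1000000000 ;;
    *t) mult=1000000000000 ;;
    *p) mult=1000000000000000 ;;
    *)  mult=1 ;;
  esac
  num=${1%[KMGTPkmgtp]}
  echo $(( num * mult ))
}

size_to_bytes 2G     # prints: 2147483648
size_to_bytes 500k   # prints: 500000
```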
1265
1266 =item B<--minversion> I<version>
1267
1268 Print the version of GNU B<parallel> and exit.  If the current version of
1269 GNU B<parallel> is less than I<version> the exit code is
1270 255. Otherwise it is 0.
1271
1272 This is useful for scripts that depend on features only available from
1273 a certain version of GNU B<parallel>.
1274
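A script could guard itself along these lines. This is a sketch only: B<require_parallel> is a hypothetical helper and the version number is just an example:

```shell
# Hypothetical guard for a script that needs a minimum GNU parallel
# version. The version number used below is only an example.
require_parallel() {
  command -v parallel >/dev/null 2>&1 || return 1
  parallel --minversion "$1" >/dev/null 2>&1
}

if require_parallel 20130222; then
  echo "GNU parallel is recent enough"
else
  echo "GNU parallel is missing or too old" >&2
fi
```

The helper also reports failure when no B<parallel> command is found at all, so the two cases are not distinguished.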
1275
1276 =item B<--nonall>
1277
1278 B<--onall> with no arguments. Run the command on all computers given
1279 with B<--sshlogin> but take no arguments. GNU B<parallel> will log
1280 into B<--jobs> number of computers in parallel and run the job on the
1281 computer. B<-j> adjusts how many computers to log into in parallel.
1282
1283 This is useful for running the same command (e.g. uptime) on a list of
1284 servers.
1285
1286
1287 =item B<--onall>
1288
1289 Run all the jobs on all computers given with B<--sshlogin>. GNU
1290 B<parallel> will log into B<--jobs> number of computers in parallel
1291 and run one job at a time on the computer. The order of the jobs will
1292 not be changed, but some computers may finish before others.
1293
1294 When using B<--group> the output will be grouped by each server, so
1295 all the output from one server will be grouped together.
1296
1297 B<--joblog> will contain an entry for each job on each server, so
1298 there will be several entries with job sequence 1.
1299
1300
1301 =item B<--output-as-files>
1302
1303 =item B<--outputasfiles>
1304
1305 =item B<--files>
1306
1307 Instead of printing the output to stdout (standard output) the output
1308 of each job is saved in a file and the filename is then printed.
1309
1310 See also: B<--results>
1311
1312
1313 =item B<--pipe>
1314
1315 =item B<--spreadstdin>
1316
1317 Spread input to jobs on stdin (standard input). Read a block of data
1318 from stdin (standard input) and give one block of data as input to one
1319 job.
1320
1321 The block size is determined by B<--block>. The strings B<--recstart>
1322 and B<--recend> tell GNU B<parallel> how a record starts and/or
1323 ends. The block read will have the final partial record removed before
1324 the block is passed on to the job. The partial record will be
1325 prepended to the next block.
1326
1327 If B<--recstart> is given this will be used to split at record start.
1328
1329 If B<--recend> is given this will be used to split at record end.
1330
1331 If both B<--recstart> and B<--recend> are given both will have to
1332 match to find a split position.
1333
1334 If neither B<--recstart> nor B<--recend> are given B<--recend>
1335 defaults to '\n'. To have no record separator use B<--recend "">.
1336
1337 B<--files> is often used with B<--pipe>.
1338
1339 B<--pipe> maxes out at around 1 GB/s input, and 100 MB/s output. If
1340 performance is important use B<--pipepart>.
1341
1342 See also: B<--recstart>, B<--recend>, B<--fifo>, B<--cat>,
1343 B<--pipepart>, B<--files>.
1344
1345
1346 =item B<--pipepart>
1347
1348 Pipe parts of a physical file. B<--pipepart> works similar to
1349 B<--pipe>, but is much faster.
1350
1351 B<--pipepart> has a few limitations:
1352
1353 =over 3
1354
1355 =item *
1356
1357 The file must be a normal file or a block device (technically it must
1358 be seekable) and must be given using B<-a> or B<::::>. The file cannot
1359 be a pipe or a fifo as they are not seekable.
1360
1361 If using a block device with a lot of NUL bytes, remember to set
1362 B<--recend ''>.
1363
1364 =item *
1365
1366 Record counting (B<-N>) and line counting (B<-L>/B<-l>) do not work.
1367
1368 =back
1369
1370
1371 =item B<--plain>
1372
1373 Ignore any B<--profile>, $PARALLEL, and ~/.parallel/config to get full
1374 control on the command line (used by GNU B<parallel> internally when
1375 called with B<--sshlogin>).
1376
1377
1378 =item B<--plus>
1379
1380 Activate additional replacement strings: {+/} {+.} {+..} {+...} {..}
1381 {...} {/..} {/...} {##}. The idea being that '{+foo}' matches the opposite of
1382 '{foo}' and {} = {+/}/{/} = {.}.{+.} = {+/}/{/.}.{+.} = {..}.{+..} =
1383 {+/}/{/..}.{+..} = {...}.{+...} = {+/}/{/...}.{+...}
1384
1385 B<{##}> is the number of jobs to be run. It is incompatible with
1386 B<-X>/B<-m>/B<--xargs>.
1387
1388 B<{choose_k}> is inspired by n choose k: Given a list of n elements,
1389 choose k. k is the number of input sources and n is the number of
1390 arguments in an input source.  The content of the input sources must
1391 be the same and the arguments must be unique.
1392
1393 The following dynamic replacement strings are also activated. They are
1394 inspired by bash's parameter expansion:
1395
1396   {:-str}       str if the value is empty
1397   {:num}        remove the first num characters
1398   {:num1:num2}  characters from num1 to num2
1399   {#str}        remove prefix str
1400   {%str}        remove postfix str
1401   {/str1/str2}  replace str1 with str2
1402   {^str}        uppercase str if found at the start
1403   {^^str}       uppercase str
1404   {,str}        lowercase str if found at the start
1405   {,,str}       lowercase str
1406
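The identity {} = {+/}/{/} = {.}.{+.} stated above can be checked by hand with ordinary shell parameter expansion. This is only a stand-in for the replacement strings, not how they are implemented, and it assumes a path that contains both a directory and an extension; for other paths the shell expansions behave differently:

```shell
# Plain shell equivalents for a path with a directory and an extension.
p=dir/sub/file.tar.gz

dir=${p%/*}        # like {+/}  -> dir/sub
base=${p##*/}      # like {/}   -> file.tar.gz
noext=${p%.*}      # like {.}   -> dir/sub/file.tar
ext=${p##*.}       # like {+.}  -> gz

echo "$dir/$base"      # prints: dir/sub/file.tar.gz
echo "$noext.$ext"     # prints: dir/sub/file.tar.gz
```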
1407
1408 =item B<--progress>
1409
1410 Show progress of computations. List the computers involved in the task
1411 with number of CPUs detected and the max number of jobs to run. After
1412 that show progress for each computer: number of running jobs, number
1413 of completed jobs, and percentage of all jobs done by this
1414 computer. The percentage will only be available after all jobs have
1415 been scheduled as GNU B<parallel> only reads the next job when ready to
1416 schedule it - this is to avoid wasting time and memory by reading
1417 everything at startup.
1418
1419 By sending GNU B<parallel> SIGUSR2 you can toggle turning on/off
1420 B<--progress> on a running GNU B<parallel> process.
1421
1422 See also B<--eta> and B<--bar>.
1423
1424
1425 =item B<--max-args>=I<max-args>
1426
1427 =item B<-n> I<max-args>
1428
1429 Use at most I<max-args> arguments per command line.  Fewer than
1430 I<max-args> arguments will be used if the size (see the B<-s> option)
1431 is exceeded, unless the B<-x> option is given, in which case
1432 GNU B<parallel> will exit.
1433
1434 B<-n 0> means read one argument, but insert 0 arguments on the command
1435 line.
1436
1437 Implies B<-X> unless B<-m> is set.
1438
1439
1440 =item B<--max-replace-args>=I<max-args>
1441
1442 =item B<-N> I<max-args>
1443
1444 Use at most I<max-args> arguments per command line. Like B<-n> but
1445 also makes replacement strings B<{1}> .. B<{>I<max-args>B<}> that
1446 represent arguments 1 .. I<max-args>. If there are too few arguments,
1447 the remaining B<{>I<n>B<}> will be empty.
1448
1449 B<-N 0> means read one argument, but insert 0 arguments on the command
1450 line.
1451
1452 This will set the owner of the homedir to the user:
1453
1454   tr ':' '\n' < /etc/passwd | parallel -N7 chown {1} {6}
1455
1456 Implies B<-X> unless B<-m> or B<--pipe> is set.
1457
1458 When used with B<--pipe> B<-N> is the number of records to read. This
1459 is somewhat slower than B<--block>.
1460
1461
1462 =item B<--max-line-length-allowed>
1463
1464 Print the maximal number of characters allowed on the command line and
1465 exit (used by GNU B<parallel> itself to determine the line length
1466 on remote computers).
1467
1468
1469 =item B<--number-of-cpus> (obsolete)
1470
1471 Print the number of physical CPU cores and exit.
1472
1473
1474 =item B<--number-of-cores>
1475
1476 Print the number of physical CPU cores and exit (used by GNU B<parallel> itself
1477 to determine the number of physical CPU cores on remote computers).
1478
1479
1480 =item B<--number-of-sockets>
1481
1482 Print the number of filled CPU sockets and exit (used by GNU
1483 B<parallel> itself to determine the number of filled CPU sockets on
1484 remote computers).
1485
1486
1487 =item B<--number-of-threads>
1488
1489 Print the number of hyperthreaded CPU cores and exit (used by GNU
1490 B<parallel> itself to determine the number of hyperthreaded CPU cores
1491 on remote computers).
1492
1493
1494 =item B<--no-keep-order>
1495
1496 Overrides an earlier B<--keep-order> (e.g. if set in
1497 B<~/.parallel/config>).
1498
1499
1500 =item B<--nice> I<niceness>
1501
1502 Run the command at this niceness. For simple commands you can just add
1503 B<nice> in front of the command. But if the command consists of more
1504 sub commands (Like: ls|wc) then prepending B<nice> will not always
1505 work. B<--nice> will make sure all sub commands are niced - even on
1506 remote servers.
1507
1508
1509 =item B<--interactive>
1510
1511 =item B<-p>
1512
1513 Prompt the user about whether to run each command line and read a line
1514 from the terminal.  Only run the command line if the response starts
1515 with 'y' or 'Y'.  Implies B<-t>.
1516
1517
1518 =item B<--parens> I<parensstring>
1519
1520 Define start and end parenthesis for B<{= perl expression =}>. The
1521 left and the right parenthesis can be multiple characters and are
1522 assumed to be the same length. The default is B<{==}> giving B<{=> as
1523 the start parenthesis and B<=}> as the end parenthesis.
1524
1525 Another useful setting is B<,,,,> which would make both parentheses
1526 B<,,>:
1527
1528   parallel --parens ,,,, echo foo is ,,s/I/O/g,, ::: FII
1529
1530 See also: B<--rpl> B<{= perl expression =}>
1531
1532
1533 =item B<--profile> I<profilename>
1534
1535 =item B<-J> I<profilename>
1536
1537 Use profile I<profilename> for options. This is useful if you want to
1538 have multiple profiles. You could have one profile for running jobs in
1539 parallel on the local computer and a different profile for running jobs
1540 on remote computers. See the section PROFILE FILES for examples.
1541
1542 I<profilename> corresponds to the file ~/.parallel/I<profilename>.
1543
1544 You can give multiple profiles by repeating B<--profile>. If parts of
1545 the profiles conflict, the later ones will be used.
1546
1547 Default: config
1548
1549
1550 =item B<--quote>
1551
1552 =item B<-q>
1553
1554 Quote I<command>. The command must be a simple command (see B<man
1555 bash>) without redirections and without variable assignments. This
1556 will quote the command line and arguments so special characters are
1557 not interpreted by the shell. See the section QUOTING. Most people
1558 will never need this.  Quoting is disabled by default.
1559
1560
1561 =item B<--no-run-if-empty>
1562
1563 =item B<-r>
1564
1565 If the stdin (standard input) only contains whitespace, do not run the command.
1566
1567 If used with B<--pipe> this is slow.
1568
1569
1570 =item B<--noswap>
1571
1572 Do not start new jobs on a given computer if there is both swap-in and
1573 swap-out activity.
1574
1575 The swap activity is only sampled every 10 seconds as the sampling
1576 takes 1 second to do.
1577
1578 Swap activity is computed as (swap-in)*(swap-out) which in practice is
1579 a good value: swapping out is not a problem, swapping in is not a
1580 problem, but both swapping in and out usually indicates a problem.
1581
1582 B<--memfree> may give better results, so try using that first.
1583
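The effect of multiplying the two values can be seen in a tiny sketch. B<swap_activity> is a hypothetical helper and the numbers are made up; how GNU B<parallel> samples swap-in and swap-out is not shown here:

```shell
# Sketch of the swap activity value: the product is zero unless
# pages are being swapped both in and out.
swap_activity() {
  echo $(( $1 * $2 ))
}

swap_activity 0 120    # prints: 0  (only swapping out)
swap_activity 35 0     # prints: 0  (only swapping in)
swap_activity 35 120   # prints: 4200
```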
1584
1585 =item B<--record-env>
1586
1587 Record current environment variables in ~/.parallel/ignored_vars. This
1588 is useful before using B<--env _>.
1589
1590 See also B<--env>, B<--session>.
1591
1592
1593 =item B<--recstart> I<startstring>
1594
1595 =item B<--recend> I<endstring>
1596
1597 If B<--recstart> is given I<startstring> will be used to split at record start.
1598
1599 If B<--recend> is given I<endstring> will be used to split at record end.
1600
1601 If both B<--recstart> and B<--recend> are given the combined string
1602 I<endstring>I<startstring> will have to match to find a split
1603 position. This is useful if either I<startstring> or I<endstring>
1604 match in the middle of a record.
1605
1606 If neither B<--recstart> nor B<--recend> are given then B<--recend>
1607 defaults to '\n'. To have no record separator use B<--recend "">.
1608
1609 B<--recstart> and B<--recend> are used with B<--pipe>.
1610
1611 Use B<--regexp> to interpret B<--recstart> and B<--recend> as regular
1612 expressions. This is slow, however.
1613
1614
1615 =item B<--regexp>
1616
1617 Use B<--regexp> to interpret B<--recstart> and B<--recend> as regular
1618 expressions. This is slow, however.
1619
1620
1621 =item B<--remove-rec-sep>
1622
1623 =item B<--removerecsep>
1624
1625 =item B<--rrs>
1626
1627 Remove the text matched by B<--recstart> and B<--recend> before piping
1628 it to the command.
1629
1630 Only used with B<--pipe>.
1631
1632
1633 =item B<--results> I<name>
1634
1635 =item B<--res> I<name>
1636
1637 Save the output into files.
1638
1639 B<Simple string output dir>
1640
1641 If I<name> does not contain replacement strings and does not end in
1642 B<.csv/.tsv>, the output will be stored in a directory tree rooted at
1643 I<name>.  Within this directory tree, each command will result in
1644 three files: I<name>/<ARGS>/stdout, I<name>/<ARGS>/stderr, and
1645 I<name>/<ARGS>/seq, where <ARGS> is a sequence of directories
1646 representing the header of the input source (if using B<--header :>)
1647 or the number of the input source and corresponding values.
1648
1649 E.g:
1650
1651   parallel --header : --results foo echo {a} {b} \
1652     ::: a I II ::: b III IIII
1653
1654 will generate the files:
1655
1656   foo/a/II/b/III/seq
1657   foo/a/II/b/III/stderr
1658   foo/a/II/b/III/stdout
1659   foo/a/II/b/IIII/seq
1660   foo/a/II/b/IIII/stderr
1661   foo/a/II/b/IIII/stdout
1662   foo/a/I/b/III/seq
1663   foo/a/I/b/III/stderr
1664   foo/a/I/b/III/stdout
1665   foo/a/I/b/IIII/seq
1666   foo/a/I/b/IIII/stderr
1667   foo/a/I/b/IIII/stdout
1668
1669 and
1670
1671   parallel --results foo echo {1} {2} ::: I II ::: III IIII
1672
1673 will generate the files:
1674
1675   foo/1/II/2/III/seq
1676   foo/1/II/2/III/stderr
1677   foo/1/II/2/III/stdout
1678   foo/1/II/2/IIII/seq
1679   foo/1/II/2/IIII/stderr
1680   foo/1/II/2/IIII/stdout
1681   foo/1/I/2/III/seq
1682   foo/1/I/2/III/stderr
1683   foo/1/I/2/III/stdout
1684   foo/1/I/2/IIII/seq
1685   foo/1/I/2/IIII/stderr
1686   foo/1/I/2/IIII/stdout
1687
1688
1689 B<CSV file output>
1690
1691 If I<name> ends in B<.csv>/B<.tsv> the output will be a CSV-file
1692 named I<name>.
1693
1694 B<.csv> gives a comma separated value file. B<.tsv> gives a TAB
1695 separated value file.
1696
1697 B<-.csv>/B<-.tsv> are special: It will give the file on stdout
1698 (standard output).
1699
1700
1701 B<Replacement string output file>
1702
1703 If I<name> contains a replacement string and the replaced result does
1704 not end in /, then the standard output will be stored in a file named
1705 by this result. Standard error will be stored in the same file name
1706 with '.err' added, and the sequence number will be stored in the same
1707 file name with '.seq' added.
1708
1709 E.g.
1710
1711   parallel --results my_{} echo ::: foo bar baz
1712
1713 will generate the files:
1714
1715   my_bar
1716   my_bar.err
1717   my_bar.seq
1718   my_baz
1719   my_baz.err
1720   my_baz.seq
1721   my_foo
1722   my_foo.err
1723   my_foo.seq
1724
1725
1726 B<Replacement string output dir>
1727
1728 If I<name> contains a replacement string and the replaced result ends
1729 in /, then output files will be stored in the resulting dir.
1730
1731 E.g.
1732
1733   parallel --results my_{}/ echo ::: foo bar baz
1734
1735 will generate the files:
1736
1737   my_bar/seq
1738   my_bar/stderr
1739   my_bar/stdout
1740   my_baz/seq
1741   my_baz/stderr
1742   my_baz/stdout
1743   my_foo/seq
1744   my_foo/stderr
1745   my_foo/stdout
1746
1747 See also B<--files>, B<--tag>, B<--header>, B<--joblog>.
1748
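Since the directory layout is regular, the results can be post-processed with standard tools. The sketch below builds a mock tree shaped like the B<--results foo> example above (the file contents are made up) and lists every job that wrote to stderr. B<jobs_with_stderr> is a hypothetical helper:

```shell
# Mock tree standing in for the output of:
#   parallel --results foo echo {1} {2} ::: I II ::: III IIII
root=$(mktemp -d)
for a in I II; do
  for b in III IIII; do
    d="$root/foo/1/$a/2/$b"
    mkdir -p "$d"
    echo "$a $b" > "$d/stdout"
    : > "$d/stderr"
  done
done
echo "made-up error message" > "$root/foo/1/II/2/III/stderr"

# Print the directory of every job that wrote something to stderr.
jobs_with_stderr() {
  find "$1" -type f -name stderr -size +0c | sed 's:/stderr$::'
}

jobs_with_stderr "$root/foo"
```

With the mock data only the directory for the arguments II and III is printed.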
1749
1750 =item B<--resume>
1751
1752 Resumes from the last unfinished job. By reading B<--joblog> or the
1753 B<--results> dir GNU B<parallel> will figure out the last unfinished
1754 job and continue from there. Because GNU B<parallel> only looks at the
1755 sequence numbers in B<--joblog>, the input, the command, and
1756 B<--joblog> all have to remain unchanged; otherwise GNU B<parallel>
1757 may run wrong commands.
1758
1759 See also B<--joblog>, B<--results>, B<--resume-failed>, B<--retries>.
1760
1761
1762 =item B<--resume-failed>
1763
1764 Retry all failed jobs and resume from the last unfinished job. By reading
1765 B<--joblog> GNU B<parallel> will figure out the failed jobs and run
1766 those again. After that it will resume from the last unfinished job and
1767 continue from there. Because GNU B<parallel> only looks at the sequence
1768 numbers in B<--joblog>, the input, the command, and B<--joblog>
1769 all have to remain unchanged; otherwise GNU B<parallel> may run wrong
1770 commands.
1771
1772 See also B<--joblog>, B<--resume>, B<--retry-failed>, B<--retries>.
1773
1774
1775 =item B<--retry-failed>
1776
1777 Retry all failed jobs in joblog. By reading B<--joblog> GNU
1778 B<parallel> will figure out the failed jobs and run those again.
1779
1780 B<--retry-failed> ignores the command and arguments on the command
1781 line: It only looks at the joblog.
1782
1783 B<Differences between --resume, --resume-failed, --retry-failed>
1784
1785 In this example B<exit {= $_%=2 =}> will cause every other job to fail.
1786
1787   timeout -k 1 4 parallel --joblog log -j10 \
1788     'sleep {}; exit {= $_%=2 =}' ::: {10..1}
1789
1790 4 jobs completed. 2 failed:
1791
1792   Seq   [...]   Exitval Signal  Command
1793   10    [...]   1       0       sleep 1; exit 1
1794   9     [...]   0       0       sleep 2; exit 0
1795   8     [...]   1       0       sleep 3; exit 1
1796   7     [...]   0       0       sleep 4; exit 0
1797
1798 B<--resume> does not care about the Exitval, but only looks at Seq. If
1799 the Seq is run, it will not be run again. So if needed, you can change
1800 the command for the seqs not run yet:
1801
1802   parallel --resume --joblog log -j10 \
1803     'sleep .{}; exit {= $_%=2 =}' ::: {10..1}
1804
1805   Seq   [...]   Exitval Signal  Command
1806   [... as above ...]
1807   1     [...]   0       0       sleep .10; exit 0
1808   6     [...]   1       0       sleep .5; exit 1
1809   5     [...]   0       0       sleep .6; exit 0
1810   4     [...]   1       0       sleep .7; exit 1
1811   3     [...]   0       0       sleep .8; exit 0
1812   2     [...]   1       0       sleep .9; exit 1
1813
1814 B<--resume-failed> cares about the Exitval, but also only looks at Seq
1815 to figure out which commands to run. Again this means you can change
1816 the command, but not the arguments. It will run the failed seqs and
1817 the seqs not yet run:
1818
1819   parallel --resume-failed --joblog log -j10 \
1820     'echo {};sleep .{}; exit {= $_%=3 =}' ::: {10..1}
1821
1822   Seq   [...]   Exitval Signal  Command
1823   [... as above ...]
1824   10    [...]   1       0       echo 1;sleep .1; exit 1
1825   8     [...]   0       0       echo 3;sleep .3; exit 0
1826   6     [...]   2       0       echo 5;sleep .5; exit 2
1827   4     [...]   1       0       echo 7;sleep .7; exit 1
1828   2     [...]   0       0       echo 9;sleep .9; exit 0
1829
1830 B<--retry-failed> cares about the Exitval, but takes the command from
1831 the joblog. It ignores any arguments or commands given on the command
1832 line:
1833
1834   parallel --retry-failed --joblog log -j10 this part is ignored
1835
1836   Seq   [...]   Exitval Signal  Command
1837   [... as above ...]
1838   10    [...]   1       0       echo 1;sleep .1; exit 1
1839   6     [...]   2       0       echo 5;sleep .5; exit 2
1840   4     [...]   1       0       echo 7;sleep .7; exit 1
1841
1842 See also B<--joblog>, B<--resume>, B<--resume-failed>, B<--retries>.
1843
1844
1845 =item B<--retries> I<n>
1846
1847 If a job fails, retry it on another computer on which it has not
1848 failed. Do this I<n> times. If there are fewer than I<n> computers in
1849 B<--sshlogin> GNU B<parallel> will re-use all the computers. This is
1850 useful if some jobs fail for no apparent reason (such as network
1851 failure).
1852
1853
1854 =item B<--return> I<filename>
1855
1856 Transfer files from remote computers. B<--return> is used with
1857 B<--sshlogin> when the arguments are files on the remote computers. When
1858 processing is done the file I<filename> will be transferred
1859 from the remote computer using B<rsync> and will be put relative to
1860 the default login dir. E.g.
1861
1862   echo foo/bar.txt | parallel --return {.}.out \
1863     --sshlogin server.example.com touch {.}.out
1864
1865 This will transfer the file I<$HOME/foo/bar.out> from the computer
1866 I<server.example.com> to the file I<foo/bar.out> after running
1867 B<touch foo/bar.out> on I<server.example.com>.
1868
1869   parallel -S server --trc out/./{}.out touch {}.out ::: in/file
1870
1871 This will transfer the file I<in/file.out> from the computer
1872 I<server> to the file I<out/in/file.out> after running
1873 B<touch in/file.out> on I<server>.
1874
1875   echo /tmp/foo/bar.txt | parallel --return {.}.out \
1876     --sshlogin server.example.com touch {.}.out
1877
1878 This will transfer the file I</tmp/foo/bar.out> from the computer
1879 I<server.example.com> to the file I</tmp/foo/bar.out> after running
1880 B<touch /tmp/foo/bar.out> on I<server.example.com>.
1881
1882 Multiple files can be transferred by repeating the option multiple
1883 times:
1884
1885   echo /tmp/foo/bar.txt | parallel \
1886     --sshlogin server.example.com \
1887     --return {.}.out --return {.}.out2 touch {.}.out {.}.out2
1888
1889 B<--return> is often used with B<--transferfile> and B<--cleanup>.
1890
1891 B<--return> is ignored when used with B<--sshlogin :> or when not used
1892 with B<--sshlogin>.
1893
1894
1895 =item B<--round-robin>
1896
1897 =item B<--round>
1898
1899 Normally B<--pipe> will give a single block to each instance of the
1900 command. With B<--round-robin> all blocks will at random be written to
1901 commands already running. This is useful if the command takes a long
1902 time to initialize.
1903
1904 B<--keep-order> will not work with B<--round-robin> as it is
1905 impossible to track which input block corresponds to which output.
1906
1907 B<--round-robin> implies B<--pipe>, except if B<--pipepart> is given.
1908
1909
1910 =item B<--rpl> 'I<tag> I<perl expression>'
1911
1912 Use I<tag> as a replacement string for I<perl expression>. This makes
1913 it possible to define your own replacement strings. GNU B<parallel>'s
1914 7 replacement strings are implemented as:
1915
1916   --rpl '{} '
1917   --rpl '{#} 1 $_=$job->seq()'
1918   --rpl '{%} 1 $_=$job->slot()'
1919   --rpl '{/} s:.*/::'
1920   --rpl '{//} $Global::use{"File::Basename"} ||=
1921     eval "use File::Basename; 1;"; $_ = dirname($_);'
1922   --rpl '{/.} s:.*/::; s:\.[^/.]+$::;'
1923   --rpl '{.} s:\.[^/.]+$::'
1924
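To see what the simplest of these expressions do to an argument, they
can be tried outside GNU B<parallel>. The sketch below feeds a made-up
path through B<sed -E> using the same substitutions as B<{/}>, B<{.}>,
and B<{/.}>. The path and the variable names are only for
illustration, and it assumes that B<sed>'s extended regular
expressions agree with Perl for these simple patterns.

```shell
# Input used for illustration only.
path='dir/sub/file.tar.gz'

# {/}  : s:.*/::          (strip the directory part)
basename_part=$(printf '%s\n' "$path" | sed -E 's:.*/::')

# {.}  : s:\.[^/.]+$::    (strip the last extension)
no_ext=$(printf '%s\n' "$path" | sed -E 's:\.[^/.]+$::')

# {/.} : both substitutions, one after the other
base_no_ext=$(printf '%s\n' "$path" | sed -E 's:.*/::; s:\.[^/.]+$::')

printf '%s\n' "$basename_part" "$no_ext" "$base_no_ext"
```

This prints I<file.tar.gz>, I<dir/sub/file.tar>, and I<file.tar>.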
1925 The B<--plus> replacement strings are implemented as:
1926
1927   --rpl '{+/} s:/[^/]*$::'
1928   --rpl '{+.} s:.*\.::'
1929   --rpl '{+..} s:.*\.([^.]*\.):$1:'
1930   --rpl '{+...} s:.*\.([^.]*\.[^.]*\.):$1:'
1931   --rpl '{..} s:\.[^/.]+$::; s:\.[^/.]+$::'
1932   --rpl '{...} s:\.[^/.]+$::; s:\.[^/.]+$::; s:\.[^/.]+$::'
1933   --rpl '{/..} s:.*/::; s:\.[^/.]+$::; s:\.[^/.]+$::'
1934   --rpl '{/...} s:.*/::;s:\.[^/.]+$::;s:\.[^/.]+$::;s:\.[^/.]+$::'
1935   --rpl '{##} $_=total_jobs()'
1936   --rpl '{:-(.+?)} $_ ||= $$1'
1937   --rpl '{:(\d+?)} substr($_,0,$$1) = ""'
1938   --rpl '{:(\d+?):(\d+?)} $_ = substr($_,$$1,$$2);'
1939   --rpl '{#([^#].*?)} s/^$$1//;'
1940   --rpl '{%(.+?)} s/$$1$//;'
1941   --rpl '{/(.+?)/(.*?)} s/$$1/$$2/;'
1942   --rpl '{^(.+?)} s/^($$1)/uc($1)/e;'
1943   --rpl '{^^(.+?)} s/($$1)/uc($1)/eg;'
1944   --rpl '{,(.+?)} s/^($$1)/lc($1)/e;'
1945   --rpl '{,,(.+?)} s/($$1)/lc($1)/eg;'
1946
1947
1948 If the user-defined replacement string starts with '{' it can also be
1949 used as a positional replacement string (like B<{2.}>).
1950
1951 It is recommended to only change $_ but you have full access to all
1952 of GNU B<parallel>'s internal functions and data structures.
1953
1954 Here are a few examples:
1955
1956   Is the job sequence even or odd?
1957   --rpl '{odd} $_ = seq() % 2 ? "odd" : "even"'
1958   Pad job sequence with leading zeros to get equal width
1959   --rpl '{0#} $f=1+int("".(log(total_jobs())/log(10)));
1960     $_=sprintf("%0${f}d",seq())'
1961   Job sequence counting from 0
1962   --rpl '{#0} $_ = seq() - 1'
1963   Job slot counting from 2
1964   --rpl '{%1} $_ = slot() + 1'
1965   Remove all extensions
1966   --rpl '{:} s:(\.[^/]+)*$::'
1967
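The arithmetic behind the zero-padding example can be checked without
GNU B<parallel>. The sketch below is a stand-in, not the real
implementation: B<pad_seq> is a hypothetical helper, and the
B<%.15g> formatting is an assumption about why the Perl expression
converts the logarithm to a string before calling B<int> (presumably
so a ratio that floating point rounds to just below a whole number is
still treated as that whole number).

```shell
# Hypothetical helper: pad_seq SEQ TOTAL prints SEQ zero-padded to the
# width of TOTAL, following the arithmetic of the {0#} example.
pad_seq() {
  # "%.15g" mimics the string conversion ("".) in the Perl expression.
  width=$(LC_ALL=C awk -v total="$2" 'BEGIN {
    print 1 + int(sprintf("%.15g", log(total) / log(10)))
  }')
  printf "%0${width}d\n" "$1"
}

pad_seq 7 120
pad_seq 42 1000
```

This prints I<007> and I<0042>.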
1968 You can have dynamic replacement strings by including parentheses in
1969 the replacement string and adding a regular expression between the
1970 parentheses. The matching string will be inserted as $$1:
1971
1972   parallel --rpl '{%(.*?)} s/$$1//' echo {%.tar.gz} ::: my.tar.gz
1973   parallel --rpl '{:%(.+?)} s:$$1(\.[^/]+)*$::' \
1974     echo {:%_file} ::: my_file.tar.gz
1975   parallel -n3 --rpl '{/:%(.*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:' \
1976     echo job {#}: {2} {2.} {3/:%_1} ::: a/b.c c/d.e f/g_1.h.i
1977
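What happens to B<$$1> in the first two commands can be spelled out by
hand. In this sketch the captured text has been pasted into the
expression manually and B<sed -E> stands in for Perl, which is assumed
to behave the same for these patterns. The variable names are
hypothetical.

```shell
# {%.tar.gz} captures ".tar.gz" as $$1, so s/$$1// becomes s/.tar.gz//
# (the dots act as regular expression wildcards here).
first=$(printf '%s\n' my.tar.gz | sed -E 's/.tar.gz//')

# {:%_file} captures "_file", so the expression becomes
# s:_file(\.[^/]+)*$::
second=$(printf '%s\n' my_file.tar.gz | sed -E 's:_file(\.[^/]+)*$::')

printf '%s\n' "$first" "$second"
```

Both lines of output are I<my>.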
1978 You can even use multiple matches:
1979
1980   parallel --rpl '{/(.+?)/(.*?)} s/$$1/$$2/;' \
1981     echo {/replacethis/withthis} {/b/C} ::: a_replacethis_b
1982
1983   parallel --rpl '{(.*?)/(.*?)} $_="$$2$_$$1"' \
1984     echo {swap/these} ::: -middle-
1985
1986 See also: B<{= perl expression =}> B<--parens>
1987
1988
1989 =item B<--rsync-opts> I<options>
1990
1991 Options to pass on to B<rsync>. Setting B<--rsync-opts> takes
1992 precedence over setting the environment variable $PARALLEL_RSYNC_OPTS.
1993
1994
1995 =item B<--max-chars>=I<max-chars>
1996
1997 =item B<-s> I<max-chars>
1998
1999 Use at most I<max-chars> characters per command line, including the
2000 command and initial-arguments and the terminating nulls at the ends of
2001 the argument strings.  The largest allowed value is system-dependent,
2002 and is calculated as the argument length limit for exec, less the size
2003 of your environment.  The default value is the maximum.
2004
2005 Implies B<-X> unless B<-m> is set.
2006
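The calculation described above (the argument length limit for exec,
less the size of the environment) can be approximated by hand. This is
only an estimate under the assumption that B<getconf ARG_MAX> reports
the exec limit on your system; GNU B<parallel> does its own
calculation, which B<--show-limits> reports.

```shell
# Rough upper bound: the exec() argument limit minus the bytes used by
# the current environment. Treat the result only as an estimate.
arg_max=$(getconf ARG_MAX)
env_size=$(env | wc -c)
estimate=$((arg_max - env_size))
echo "ARG_MAX=$arg_max environment=$env_size estimate=$estimate"
```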
2007
2008 =item B<--show-limits>
2009
2010 Display the limits on the command-line length which are imposed by the
2011 operating system and the B<-s> option.  Pipe the input from /dev/null
2012 (and perhaps specify --no-run-if-empty) if you don't want GNU B<parallel>
2013 to do anything.
2014
2015
2016 =item B<--semaphore>
2017
2018 Work as a counting semaphore. B<--semaphore> will cause GNU
2019 B<parallel> to start I<command> in the background. When the number of
2020 jobs given by B<--jobs> is reached, GNU B<parallel> will wait for one of
2021 these to complete before starting another command.
2022
2023 B<--semaphore> implies B<--bg> unless B<--fg> is specified.
2024
2025 B<--semaphore> implies B<--semaphorename `tty`> unless
2026 B<--semaphorename> is specified.
2027
2028 Used with B<--fg>, B<--wait>, and B<--semaphorename>.
2029
2030 The command B<sem> is an alias for B<parallel --semaphore>.
2031
2032 See also B<man sem>.
2033
2034
2035 =item B<--semaphorename> I<name>
2036
2037 =item B<--id> I<name>
2038
2039 Use I<name> as the name of the semaphore. Default is the name of the
2040 controlling tty (output from B<tty>).
2041
2042 The default normally works as expected when used interactively, but
2043 when used in a script I<name> should be set. I<$$> or I<my_task_name>
2044 is often a good value.
2045
2046 The semaphore is stored in ~/.parallel/semaphores/
2047
2048 Implies B<--semaphore>.
2049
2050 See also B<man sem>.
2051
2052
2053 =item B<--semaphoretimeout> I<secs>
2054
2055 =item B<--st> I<secs>
2056
2057 If I<secs> > 0: If the semaphore is not released within I<secs> seconds, take it anyway.
2058
2059 If I<secs> < 0: If the semaphore is not released within I<secs> seconds, exit.
2060
2061 Implies B<--semaphore>.
2062
2063 See also B<man sem>.
2064
2065
2066 =item B<--seqreplace> I<replace-str>
2067
2068 Use the replacement string I<replace-str> instead of B<{#}> for
2069 job sequence number.
2070
2071
2072 =item B<--session>
2073
2074 Record names in current environment in B<$PARALLEL_IGNORED_NAMES> and
2075 exit. Only used with B<env_parallel>. Aliases, functions, and
2076 variables with names in B<$PARALLEL_IGNORED_NAMES> will not be copied.
2077
2078 Only supported in B<Ash, Bash, Dash, Ksh, Sh, and Zsh>.
2079
2080 See also B<--env>, B<--record-env>.
2081
2082
2083 =item B<--shard> I<shardkey> (beta testing)
2084
2085 Use column I<shardkey> as shard key and shard input to the jobs.
2086
2087 Each input line is split using B<--colsep>. The value in the
2088 I<shardkey> column is hashed so that all lines of a given value are
2089 given to the same job slot.
2090
2091 This is similar to sharding in databases.
2092
2093 The performance is in the order of 100K rows per second. Faster if the
2094 I<shardkey> is small (<10), slower if it is big (>100).
2095
2096 B<--shard> requires B<--pipe> and a fixed numeric value for B<--jobs>.
2097
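The idea of hashing a key to a job slot can be sketched in plain
shell. Everything here is an assumption made for illustration: the
helper B<shard_slot> is hypothetical and B<cksum> merely stands in for
whatever hash function GNU B<parallel> uses. The point is only that
equal keys always land in the same slot.

```shell
# Hypothetical helper: map a shard key to one of $2 job slots.
# cksum stands in for the hash function; GNU parallel uses its own.
shard_slot() {
  hash=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  echo $((hash % $2 + 1))
}

for key in alice bob alice carol bob; do
  echo "$key -> slot $(shard_slot "$key" 4)"
done
```

Both I<alice> lines show the same slot number, as do both I<bob> lines.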
2098
2099 =item B<--shebang>
2100
2101 =item B<--hashbang>
2102
2103 GNU B<parallel> can be called as a shebang (#!) command as the first
2104 line of a script. The content of the file will be treated as
2105 the input source.
2106
2107 Like this:
2108
2109   #!/usr/bin/parallel --shebang -r wget
2110
2111   https://ftpmirror.gnu.org/parallel/parallel-20120822.tar.bz2
2112   https://ftpmirror.gnu.org/parallel/parallel-20130822.tar.bz2
2113   https://ftpmirror.gnu.org/parallel/parallel-20140822.tar.bz2
2114
2115 B<--shebang> must be set as the first option.
2116
2117 On FreeBSD B<env> is needed:
2118
2119   #!/usr/bin/env -S parallel --shebang -r wget
2120
2121   https://ftpmirror.gnu.org/parallel/parallel-20120822.tar.bz2
2122   https://ftpmirror.gnu.org/parallel/parallel-20130822.tar.bz2
2123   https://ftpmirror.gnu.org/parallel/parallel-20140822.tar.bz2
2124
2125 There are many limitations of shebang (#!) depending on your operating
2126 system. See details on http://www.in-ulm.de/~mascheck/various/shebang/
2127
2128
2129 =item B<--shebang-wrap>
2130
2131 GNU B<parallel> can parallelize scripts by wrapping the shebang
2132 line. If the program can be run like this:
2133
2134   cat arguments | parallel the_program
2135
2136 then the script can be changed to:
2137
2138   #!/usr/bin/parallel --shebang-wrap /original/parser --options
2139
2140 E.g.
2141
2142   #!/usr/bin/parallel --shebang-wrap /usr/bin/python
2143
2144 If the program can be run like this:
2145
2146   cat data | parallel --pipe the_program
2147
2148 then the script can be changed to:
2149
2150   #!/usr/bin/parallel --shebang-wrap --pipe /orig/parser --opts
2151
2152 E.g.
2153
2154   #!/usr/bin/parallel --shebang-wrap --pipe /usr/bin/perl -w
2155
2156 B<--shebang-wrap> must be set as the first option.
2157
2158
2159 =item B<--shellquote>
2160
2161 Does not run the command but quotes it. Useful for making quoted
2162 composed commands for GNU B<parallel>.
2163
2164 Multiple B<--shellquote> will quote the string multiple times, so
2165 B<parallel --shellquote | parallel --shellquote> can be written as
2166 B<parallel --shellquote --shellquote>.
2167
2168
2169 =item B<--shuf>
2170
2171 Shuffle jobs. When having multiple input sources it is hard to
2172 randomize jobs. B<--shuf> will generate all jobs, and shuffle them before
2173 running them. This is useful to get a quick preview of the results
2174 before running the full batch.
2175
2176
2177 =item B<--skip-first-line>
2178
2179 Do not use the first line of input (used by GNU B<parallel> itself
2180 when called with B<--shebang>).
2181
2182
2183 =item B<--sql> I<DBURL> (obsolete)
2184
2185 Use B<--sqlmaster> instead.
2186
2187
2188 =item B<--sqlmaster> I<DBURL>
2189
2190 Submit jobs via SQL server. I<DBURL> must point to a table, which will
2191 contain the same information as B<--joblog>, the values from the input
2192 sources (stored in columns V1 .. Vn), and the output (stored in
2193 columns Stdout and Stderr).
2194
2195 If I<DBURL> is prepended with '+' GNU B<parallel> assumes the table is
2196 already made with the correct columns and appends the jobs to it.
2197
2198 If I<DBURL> is not prepended with '+' the table will be dropped and
2199 created with the correct amount of V-columns unless
2200
2201 B<--sqlmaster> does not run any jobs, but it creates the values for
2202 the jobs to be run. One or more B<--sqlworker> must be run to actually
2203 execute the jobs.
2204
2205 If B<--wait> is set, GNU B<parallel> will wait for the jobs to
2206 complete.
2207
2208 The format of a DBURL is:
2209
2210   [sql:]vendor://[[user][:pwd]@][host][:port]/[db]/table
2211
2212 E.g.
2213
2214   sql:mysql://hr:hr@localhost:3306/hrdb/jobs
2215   mysql://scott:tiger@my.example.com/pardb/paralleljobs
2216   sql:oracle://scott:tiger@ora.example.com/xe/parjob
2217   postgresql://scott:tiger@pg.example.com/pgdb/parjob
2218   pg:///parjob
2219   sqlite3:///pardb/parjob
2220
2221 It can also be an alias from ~/.sql/aliases:
2222
2223   :myalias mysql:///mydb/paralleljobs
2224
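To make the DBURL format above concrete, the sketch below takes one of
the example URLs apart with shell parameter expansion. It is not how
GNU B<parallel> parses a DBURL. The variable names are hypothetical,
and the sketch assumes that neither the user name nor the password
contains a '/'.

```shell
dburl='sql:mysql://hr:hr@localhost:3306/hrdb/jobs'

rest=${dburl#sql:}        # drop the optional "sql:" prefix
vendor=${rest%%://*}      # text before "://"
path=${rest#*://}         # everything after "://"
table=${path##*/}         # last path component
db=${path%/*}             # strip "/table" ...
db=${db##*/}              # ... and keep the component before it

echo "vendor=$vendor db=$db table=$table"
```

This prints I<vendor=mysql db=hrdb table=jobs>.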
2225
2226 =item B<--sqlandworker> I<DBURL>
2227
2228 Shorthand for: B<--sqlmaster> I<DBURL> B<--sqlworker> I<DBURL>.
2229
2230
2231 =item B<--sqlworker> I<DBURL>
2232
2233 Execute jobs via SQL server. Read the input source variables from the
2234 table pointed to by I<DBURL>. The I<command> on the command line
2235 should be the same as given by B<--sqlmaster>.
2236
2237 If you have more than one B<--sqlworker>, jobs may be run more than
2238 once.
2239
2240 If B<--sqlworker> runs on the local machine, the hostname in the SQL
2241 table will not be ':' but instead the hostname of the machine.
2242
2243
2244 =item B<--ssh> I<sshcommand>
2245
2246 GNU B<parallel> defaults to using B<ssh> for remote access. This can
2247 be overridden with B<--ssh>. It can also be set on a per server
2248 basis (see B<--sshlogin>).
2249
2250
2251 =item B<--sshdelay> I<secs>
2252
2253 Delay starting next ssh by I<secs> seconds. GNU B<parallel> will pause
2254 I<secs> seconds after starting each ssh. I<secs> can be less than 1
2255 second.
2256
2257
2258 =item B<-S> I<[@hostgroups/][ncpus/]sshlogin[,[@hostgroups/][ncpus/]sshlogin[,...]]>
2259
2260 =item B<-S> I<@hostgroup>
2261
2262 =item B<--sshlogin> I<[@hostgroups/][ncpus/]sshlogin[,[@hostgroups/][ncpus/]sshlogin[,...]]>
2263
2264 =item B<--sshlogin> I<@hostgroup>
2265
2266 Distribute jobs to remote computers. The jobs will be run on a list of
2267 remote computers.
2268
2269 If I<hostgroups> is given, the I<sshlogin> will be added to that
2270 hostgroup. Multiple hostgroups are separated by '+'. The I<sshlogin>
2271 will always be added to a hostgroup named the same as I<sshlogin>.
2272
2273 If only the I<@hostgroup> is given, only the sshlogins in that
2274 hostgroup will be used. Multiple I<@hostgroup> can be given.
2275
2276 GNU B<parallel> will determine the number of CPUs on the remote
2277 computers and run the number of jobs as specified by B<-j>.  If the
2278 number I<ncpus> is given GNU B<parallel> will use this number for
2279 number of CPUs on the host. Normally I<ncpus> will not be
2280 needed.
2281
2282 An I<sshlogin> is of the form:
2283
2284   [sshcommand [options]] [username@]hostname
2285
2286 The sshlogin must not require a password (B<ssh-agent>,
2287 B<ssh-copy-id>, and B<sshpass> may help with that).
2288
2289 The sshlogin ':' is special, it means 'no ssh' and will therefore run
2290 on the local computer.
2291
2292 The sshlogin '..' is special, it reads sshlogins from
2293 ~/.parallel/sshloginfile or $XDG_CONFIG_HOME/parallel/sshloginfile.
2294
2295 The sshlogin '-' is special, too, it reads sshlogins from stdin
2296 (standard input).
2297
2298 To specify more sshlogins separate the sshlogins by comma, newline (in
2299 the same string), or repeat the options multiple times.
2300
2301 For examples: see B<--sshloginfile>.
2302
2303 The remote host must have GNU B<parallel> installed.
2304
2305 B<--sshlogin> is known to cause problems with B<-m> and B<-X>.
2306
2307 B<--sshlogin> is often used with B<--transferfile>, B<--return>,
2308 B<--cleanup>, and B<--trc>.
2309
2310
2311 =item B<--sshloginfile> I<filename>
2312
2313 =item B<--slf> I<filename>
2314
2315 File with sshlogins. The file consists of sshlogins on separate
2316 lines. Empty lines and lines starting with '#' are ignored. Example:
2317
2318   server.example.com
2319   username@server2.example.com
2320   8/my-8-cpu-server.example.com
2321   2/my_other_username@my-dualcore.example.net
2322   # This server has SSH running on port 2222
2323   ssh -p 2222 server.example.net
2324   4/ssh -p 2222 quadserver.example.net
2325   # Use a different ssh program
2326   myssh -p 2222 -l myusername hexacpu.example.net
2327   # Use a different ssh program with default number of CPUs
2328   //usr/local/bin/myssh -p 2222 -l myusername hexacpu
2329   # Use a different ssh program with 6 CPUs
2330   6//usr/local/bin/myssh -p 2222 -l myusername hexacpu
2331   # Assume 16 CPUs on the local computer
2332   16/:
2333   # Put server1 in hostgroup1
2334   @hostgroup1/server1
2335   # Put myusername@server2 in hostgroup1+hostgroup2
2336   @hostgroup1+hostgroup2/myusername@server2
2337   # Force 4 CPUs and put 'ssh -p 2222 server3' in hostgroup1
2338   @hostgroup1/4/ssh -p 2222 server3
2339
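The rule that empty lines and lines starting with '#' are ignored can
be mimicked with B<grep> to preview which entries of a file count as
sshlogins. The file content and variable names below are made up for
the illustration, and GNU B<parallel> does its own parsing.

```shell
slf=$(mktemp)
cat > "$slf" <<'EOF'
# build servers
server.example.com

8/my-8-cpu-server.example.com
# This server has SSH running on port 2222
ssh -p 2222 server.example.net
EOF

# Drop empty lines and lines starting with '#'.
logins=$(grep -Ev '^(#|$)' "$slf")
printf '%s\n' "$logins"
rm -f "$slf"
```

Only the three sshlogin lines are printed.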
2340 When using a different ssh program the last argument must be the hostname.
2341
2342 Multiple B<--sshloginfile> are allowed.
2343
2344 GNU B<parallel> will first look for the file in current dir; if that
2345 fails it looks for the file in ~/.parallel.
2346
2347 The sshloginfile '..' is special, it reads sshlogins from
2348 ~/.parallel/sshloginfile
2349
2350 The sshloginfile '.' is special, it reads sshlogins from
2351 /etc/parallel/sshloginfile
2352
2353 The sshloginfile '-' is special, too, it reads sshlogins from stdin
2354 (standard input).
2355
2356 If the sshloginfile is changed it will be re-read when a job finishes
2357 though at most once per second. This makes it possible to add and
2358 remove hosts while running.
2359
2360 This can be used to have a daemon that updates the sshloginfile to
2361 only contain servers that are up:
2362
2363     cp original.slf tmp2.slf
2364     while true; do
2365       nice parallel --nonall -j0 -k --slf original.slf \
2366         --tag echo | perl -pe 's/\t$//' > tmp.slf
2367       if ! diff tmp.slf tmp2.slf >/dev/null; then
2368         mv tmp.slf tmp2.slf
2369       fi
2370       sleep 10
2371     done &
2372     parallel --slf tmp2.slf ...
2373
2374
2375 =item B<--slotreplace> I<replace-str>
2376
2377 Use the replacement string I<replace-str> instead of B<{%}> for
2378 job slot number.
2379
2380
2381 =item B<--silent>
2382
2383 Silent.  The job to be run will not be printed. This is the default.
2384 Can be reversed with B<-v>.
2385
2386
2387 =item B<--tty>
2388
2389 Open terminal tty. If GNU B<parallel> is used for starting a program
2390 that accesses the tty (such as an interactive program) then this
2391 option may be needed. It will default to starting only one job at a
2392 time (i.e. B<-j1>), not buffer the output (i.e. B<-u>), and it will
2393 open a tty for the job.
2394
2395 You can of course override B<-j1> and B<-u>.
2396
2397 Using B<--tty> unfortunately means that GNU B<parallel> cannot kill
2398 the jobs (with B<--timeout>, B<--memfree>, or B<--halt>). This is due
2399 to GNU B<parallel> giving each child its own process group, which is
2400 then killed. Process groups are dependent on the tty.
2401
2402
2403 =item B<--tag>
2404
2405 Tag lines with arguments. Each output line will be prepended with the
2406 arguments and TAB (\t). When combined with B<--onall> or B<--nonall>
2407 the lines will be prepended with the sshlogin instead.
2408
2409 B<--tag> is ignored when using B<-u>.
2410
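The shape of tagged output can be previewed without GNU B<parallel>.
In this sketch B<awk> plays the part of the tagging step, and the
argument I<foo> and the two output lines are invented for the
illustration.

```shell
# Hypothetical stand-in for the output of one job run with the
# argument "foo": prepend the argument and a TAB to every line.
arg=foo
tagged=$(printf 'line 1\nline 2\n' |
  awk -v tag="$arg" '{ print tag "\t" $0 }')
printf '%s\n' "$tagged"
```

Each line comes out as I<foo>, a TAB, and the original line.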
2411
2412 =item B<--tagstring> I<str>
2413
2414 Tag lines with a string. Each output line will be prepended with
2415 I<str> and TAB (\t). I<str> can contain replacement strings such as
2416 B<{}>.
2417
2418 B<--tagstring> is ignored when using B<-u>, B<--onall>, and B<--nonall>.
2419
2420
2421 =item B<--tee>
2422
2423 Pipe all data to all jobs. Used with B<--pipe>/B<--pipepart> and
2424 B<:::>.
2425
2426   seq 1000 | parallel --pipe --tee -v wc {} ::: -w -l -c
2427
2428 How many numbers in 1..1000 contain 0..9, and how many bytes do they
2429 fill:
2430
2431   seq 1000 | parallel --pipe --tee --tag \
2432     'grep {1} | wc {2}' ::: {0..9} ::: -l -c
2433
2434 How many words contain a..z and how many bytes do they fill?
2435
2436   parallel -a /usr/share/dict/words --pipepart --tee --tag \
2437     'grep {1} | wc {2}' ::: {a..z} ::: -l -c
2438
2439
2440 =item B<--termseq> I<sequence>
2441
2442 Termination sequence. When a job is killed due to B<--timeout>,
2443 B<--memfree>, B<--halt>, or abnormal termination of GNU B<parallel>,
2444 I<sequence> determines how the job is killed. The default is:
2445
2446     TERM,200,TERM,100,TERM,50,KILL,25
2447
2448 which sends a TERM signal, waits 200 ms, sends another TERM signal,
2449 waits 100 ms, sends another TERM signal, waits 50 ms, sends a KILL
2450 signal, waits 25 ms, and exits. GNU B<parallel> detects if a process
2451 dies before the waiting time is up.
2452
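The default sequence can be walked through by hand with B<kill> and
B<sleep>. The sketch below is a simplified stand-in, not GNU
B<parallel>'s implementation: B<kill_with_termseq> is a hypothetical
helper, it assumes B<sleep> accepts fractional seconds, and it only
notices that the process is gone when the next signal cannot be
delivered.

```shell
# Hypothetical helper: kill_with_termseq PID SEQUENCE
# SEQUENCE is "SIGNAL,ms,SIGNAL,ms,..." as in --termseq.
kill_with_termseq() {
  pid=$1
  old_ifs=$IFS
  IFS=,
  set -- $2            # split the sequence on commas
  IFS=$old_ifs
  while [ "$#" -ge 2 ]; do
    kill -s "$1" "$pid" 2>/dev/null || return 0   # process is gone
    # Fractional sleep is an assumption; POSIX only promises integers.
    sleep "$(LC_ALL=C awk -v ms="$2" 'BEGIN { printf "%.3f", ms / 1000 }')"
    shift 2
  done
}

sleep 30 &
victim=$!
kill_with_termseq "$victim" TERM,200,TERM,100,TERM,50,KILL,25
status=0
wait "$victim" || status=$?
echo "exit status: $status"
```

The reported exit status is above 128, which is how the shell reports
a process that was terminated by a signal.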
2453
2454 =item B<--tmpdir> I<dirname>
2455
2456 Directory for temporary files. GNU B<parallel> normally buffers output
2457 into temporary files in /tmp. By setting B<--tmpdir> you can use a
2458 different dir for the files. Setting B<--tmpdir> is equivalent to
2459 setting $TMPDIR.
2460
2461
2462 =item B<--tmux> (Long beta testing)
2463
2464 Use B<tmux> for output. Start a B<tmux> session and run each job in a
2465 window in that session. No other output will be produced.
2466
2467
2468 =item B<--tmuxpane> (Long beta testing)
2469
2470 Use B<tmux> for output but put output into panes in the first window.
2471 Useful if you want to monitor the progress of fewer than 100 concurrent
2472 jobs.
2473
2474
2475 =item B<--timeout> I<duration>
2476
2477 Time out for command. If the command runs for longer than I<duration>
2478 seconds it will get killed as per B<--termseq>.
2479
2480 If I<duration> is followed by a % then the timeout will dynamically be
2481 computed as a percentage of the median runtime of successful
2482 jobs. Only values > 100% will make sense.
2483
2484 I<duration> is normally in seconds, but can be floats postfixed with
2485 B<s>, B<m>, B<h>, or B<d> which would multiply the float by 1, 60,
2486 3600, or 86400. Thus these are equivalent: B<--timeout 100000> and
2487 B<--timeout 1d3.5h16.6m4s>.
2488
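The equivalence of the two durations above can be verified with a
little arithmetic. The helper B<duration_to_seconds> below is
hypothetical and only follows the multipliers given in the text (1,
60, 3600, and 86400); GNU B<parallel>'s own parser may accept more
than this sketch does.

```shell
# Hypothetical helper: convert a --timeout style duration to seconds.
duration_to_seconds() {
  LC_ALL=C awk -v d="$1" 'BEGIN {
    mult["s"] = 1; mult["m"] = 60; mult["h"] = 3600; mult["d"] = 86400
    total = 0
    # Consume one "number[unit]" token at a time from the front.
    while (match(d, /^[0-9.]+[smhd]?/)) {
      token = substr(d, 1, RLENGTH)
      d = substr(d, RLENGTH + 1)
      unit = substr(token, length(token), 1)
      if (unit in mult)
        total += substr(token, 1, length(token) - 1) * mult[unit]
      else
        total += token          # no unit: plain seconds
    }
    printf "%.0f\n", total
  }'
}

duration_to_seconds 100000
duration_to_seconds 1d3.5h16.6m4s
```

Both calls print I<100000>: 86400 + 12600 + 996 + 4 seconds.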
2489
2490 =item B<--verbose>
2491
2492 =item B<-t>
2493
2494 Print the job to be run on stderr (standard error).
2495
2496 See also B<-v>, B<-p>.
2497
2498
2499 =item B<--transfer>
2500
2501 Transfer files to remote computers. Shorthand for: B<--transferfile {}>.
2502
2503
2504 =item B<--transferfile> I<filename>
2505
2506 =item B<--tf> I<filename>
2507
2508 B<--transferfile> is used with B<--sshlogin> to transfer files to the
2509 remote computers. The files will be transferred using B<rsync> and
2510 will be put relative to the default work dir. If the path contains /./
2511 the remaining path will be relative to the work dir. E.g.
2512
2513   echo foo/bar.txt | parallel --transferfile {} \
2514     --sshlogin server.example.com wc
2515
2516 This will transfer the file I<foo/bar.txt> to the computer
2517 I<server.example.com> to the file I<$HOME/foo/bar.txt> before running
2518 B<wc foo/bar.txt> on I<server.example.com>.
2519
2520   echo /tmp/foo/bar.txt | parallel --transferfile {} \
2521     --sshlogin server.example.com wc
2522
2523 This will transfer the file I</tmp/foo/bar.txt> to the computer
2524 I<server.example.com> to the file I</tmp/foo/bar.txt> before running
2525 B<wc /tmp/foo/bar.txt> on I<server.example.com>.
2526
2527   echo /tmp/./foo/bar.txt | parallel --transferfile {} \
2528     --sshlogin server.example.com wc {= s:.*/./:./: =}
2529
2530 This will transfer the file I</tmp/foo/bar.txt> to the computer
2531 I<server.example.com> to the file I<foo/bar.txt> before running
2532 B<wc ./foo/bar.txt> on I<server.example.com>.
2533
2534 B<--transferfile> is often used with B<--return> and B<--cleanup>. A
2535 shorthand for B<--transferfile {}> is B<--transfer>.
2536
2537 B<--transferfile> is ignored when used with B<--sshlogin :> or when
2538 not used with B<--sshlogin>.
2539
2540
2541 =item B<--trc> I<filename>
2542
2543 Transfer, Return, Cleanup. Shorthand for:
2544
2545 B<--transferfile {}> B<--return> I<filename> B<--cleanup>
2546
2547
2548 =item B<--trim> <n|l|r|lr|rl>
2549
2550 Trim white space in input.
2551
2552 =over 4
2553
2554 =item n
2555
2556 No trim. Input is not modified. This is the default.
2557
2558 =item l
2559
2560 Left trim. Remove white space from start of input. E.g. " a bc " -> "a bc ".
2561
2562 =item r
2563
2564 Right trim. Remove white space from end of input. E.g. " a bc " -> " a bc".
2565
2566 =item lr
2567
2568 =item rl
2569
2570 Both trim. Remove white space from both start and end of input. E.g.
2571 " a bc " -> "a bc". This is the default if B<--colsep> is used.
2572
2573 =back
2574
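The three trimming modes can be reproduced with B<sed> to see exactly
which white space is removed. This is a sketch that uses the example
string from the list above, with hypothetical variable names; it does
not show how GNU B<parallel> implements B<--trim>.

```shell
input=' a bc '

# l: remove white space from the start
left=$(printf '%s\n' "$input" | sed -E 's/^[[:space:]]+//')
# r: remove white space from the end
right=$(printf '%s\n' "$input" | sed -E 's/[[:space:]]+$//')
# lr / rl: remove both
both=$(printf '%s\n' "$input" |
  sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//')

# Brackets make the remaining white space visible.
printf '[%s]\n' "$left" "$right" "$both"
```

This prints I<[a bc ]>, I<[ a bc]>, and I<[a bc]>.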
2575
2576 =item B<--ungroup>
2577
2578 =item B<-u>
2579
2580 Ungroup output.  Output is printed as soon as possible and bypasses
2581 GNU B<parallel> internal processing. This may cause output from
2582 different commands to be mixed thus should only be used if you do not
2583 care about the output. Compare these:
2584
2585   seq 4 | parallel -j0 \
2586     'sleep {};echo -n start{};sleep {};echo {}end'
2587   seq 4 | parallel -u -j0 \
2588     'sleep {};echo -n start{};sleep {};echo {}end'
2589
2590 It also disables B<--tag>. GNU B<parallel> outputs faster with
2591 B<-u>. Compare the speeds of these:
2592
2593   parallel seq ::: 300000000 >/dev/null
2594   parallel -u seq ::: 300000000 >/dev/null
2595   parallel --line-buffer seq ::: 300000000 >/dev/null
2596
2597 Can be reversed with B<--group>.
2598
2599 See also: B<--line-buffer> B<--group>
2600
2601
2602 =item B<--extensionreplace> I<replace-str>
2603
2604 =item B<--er> I<replace-str>
2605
2606 Use the replacement string I<replace-str> instead of B<{.}> for input
2607 line without extension.
2608
2609
2610 =item B<--use-sockets-instead-of-threads>
2611
2612 =item B<--use-cores-instead-of-threads>
2613
2614 =item B<--use-cpus-instead-of-cores> (obsolete)
2615
2616 Determine how GNU B<parallel> counts the number of CPUs. GNU
2617 B<parallel> uses this number when the number of jobslots is computed
2618 relative to the number of CPUs (e.g. 100% or +1).
2619
2620 CPUs can be counted in three different ways:
2621
2622 =over 8
2623
2624 =item sockets
2625
2626 The number of filled CPU sockets (i.e. the number of physical chips).
2627
2628 =item cores
2629
2630 The number of physical cores (i.e. the number of physical compute
2631 cores).
2632
2633 =item threads
2634
2635 The number of hyperthreaded cores (i.e. the number of virtual
2636 cores - with some of them possibly being hyperthreaded).
2637
2638 =back
2639
2640 Normally the number of CPUs is computed as the number of CPU
2641 threads. With B<--use-sockets-instead-of-threads> or
2642 B<--use-cores-instead-of-threads> you can force it to be computed as
2643 the number of filled sockets or number of cores instead.
2644
2645 Most users will not need these options.
2646
2647 B<--use-cpus-instead-of-cores> is a (misleading) alias for
2648 B<--use-sockets-instead-of-threads> and is kept for backwards
2649 compatibility.
2650
2651
2652 =item B<-v>
2653
2654 Verbose.  Print the job to be run on stdout (standard output). Can be reversed
2655 with B<--silent>. See also B<-t>.
2656
2657 Use B<-v> B<-v> to print the wrapping ssh command when running remotely.
2658
2659
2660 =item B<--version>
2661
2662 =item B<-V>
2663
2664 Print the version of GNU B<parallel> and exit.
2665
2666
2667 =item B<--workdir> I<mydir>
2668
2669 =item B<--wd> I<mydir>
2670
2671 Files transferred using B<--transferfile> and B<--return> will be
2672 relative to I<mydir> on remote computers, and the command will be
2673 executed in the dir I<mydir>.
2674
2675 The special I<mydir> value B<...> will create working dirs under
2676 B<~/.parallel/tmp/> on the remote computers. If B<--cleanup> is given
2677 these dirs will be removed.
2678
2679 The special I<mydir> value B<.> uses the current working dir.  If the
2680 current working dir is beneath your home dir, the value B<.> is
2681 treated as the relative path to your home dir. This means that if your
2682 home dir is different on remote computers (e.g. if your login is
2683 different) the relative path will still be relative to your home dir.
2684
2685 To see the difference try:
2686
2687   parallel -S server pwd ::: ""
2688   parallel --wd . -S server pwd ::: ""
2689   parallel --wd ... -S server pwd ::: ""
2690
2691 I<mydir> can contain GNU B<parallel>'s replacement strings.
2692
2693
2694 =item B<--wait>
2695
2696 Wait for all commands to complete.
2697
2698 Used with B<--semaphore> or B<--sqlmaster>.
2699
2700 See also B<man sem>.
2701
2702
2703 =item B<-X>
2704
2705 Multiple arguments with context replace. Insert as many arguments as
2706 the command line length permits. If multiple jobs are being run in
2707 parallel: distribute the arguments evenly among the jobs. Use B<-j1>
2708 to avoid this.
2709
2710 If B<{}> is not used the arguments will be appended to the line.  If
2711 B<{}> is used as part of a word (like I<pic{}.jpg>) then the whole
2712 word will be repeated. If B<{}> is used multiple times each B<{}> will
2713 be replaced with the arguments.
2714
2715 Normally B<-X> will do the right thing, whereas B<-m> can give
2716 unexpected results if B<{}> is used as part of a word.
2717
2718 Support for B<-X> with B<--sshlogin> is limited and may fail.
2719
2720 See also B<-m>.
2721
2722
2723 =item B<--exit>
2724
2725 =item B<-x>
2726
2727 Exit if the size (see the B<-s> option) is exceeded.
2728
2729
2730 =back
2731
2732 =head1 EXAMPLE: Working as xargs -n1. Argument appending
2733
2734 GNU B<parallel> can work similarly to B<xargs -n1>.
2735
2736 To compress all html files using B<gzip> run:
2737
2738   find . -name '*.html' | parallel gzip --best
2739
2740 If the file names may contain a newline use B<-0>. Substitute FOO BAR with
2741 FUBAR in all files in this dir and subdirs:
2742
2743   find . -type f -print0 | \
2744     parallel -q0 perl -i -pe 's/FOO BAR/FUBAR/g'
2745
2746 Note B<-q> is needed because of the space in 'FOO BAR'.
2747
2748
2749 =head1 EXAMPLE: Simple network scanner
2750
2751 B<prips> can generate IP-addresses from CIDR notation. With GNU
2752 B<parallel> you can build a simple network scanner to see which
2753 addresses respond to B<ping>:
2754
2755   prips 130.229.16.0/20 | \
2756     parallel --timeout 2 -j0 \
2757       'ping -c 1 {} >/dev/null && echo {}' 2>/dev/null
2758
2759
2760 =head1 EXAMPLE: Reading arguments from command line
2761
2762 GNU B<parallel> can take the arguments from command line instead of
2763 stdin (standard input). To compress all html files in the current dir
2764 using B<gzip> run:
2765
2766   parallel gzip --best ::: *.html
2767
2768 To convert *.wav to *.mp3 using LAME running one process per CPU run:
2769
2770   parallel lame {} -o {.}.mp3 ::: *.wav
2771
2772
2773 =head1 EXAMPLE: Inserting multiple arguments
2774
2775 When moving a lot of files like this: B<mv *.log destdir> you will
2776 sometimes get the error:
2777
2778   bash: /bin/mv: Argument list too long
2779
2780 because there are too many files. You can instead do:
2781
2782   ls | grep -E '\.log$' | parallel mv {} destdir
2783
2784 This will run B<mv> for each file. It can be done faster if B<mv> gets
2785 as many arguments as will fit on the line:
2786
2787   ls | grep -E '\.log$' | parallel -m mv {} destdir
2788
2789 In many shells you can also use B<printf>:
2790
2791   printf '%s\0' *.log | parallel -0 -m mv {} destdir
2792
2793
2794 =head1 EXAMPLE: Context replace
2795
2796 To remove the files I<pict0000.jpg> .. I<pict9999.jpg> you could do:
2797
2798   seq -w 0 9999 | parallel rm pict{}.jpg
2799
2800 You could also do:
2801
2802   seq -w 0 9999 | perl -pe 's/(.*)/pict$1.jpg/' | parallel -m rm
2803
2804 The first will run B<rm> 10000 times, while the last will only run
2805 B<rm> as many times as needed to keep the command line length short
2806 enough to avoid B<Argument list too long> (it typically runs 1-2 times).
2807
2808 You could also run:
2809
2810   seq -w 0 9999 | parallel -X rm pict{}.jpg
2811
2812 This will also only run B<rm> as many times as needed to keep the
2813 command line length short enough.
2814
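The kind of command line that results from repeating the whole word
I<pict{}.jpg> can be previewed with B<xargs>, here for just four
arguments. This is a stand-in that builds the words with B<sed>
instead of using B<-X>, and B<echo> is put in front so that nothing is
removed. The variable name is hypothetical.

```shell
# Build the words pict0.jpg .. pict3.jpg and let xargs put them on one
# command line; "echo" shows the command instead of running rm.
cmdline=$(printf '%s\n' 0 1 2 3 | sed 's/.*/pict&.jpg/' | xargs echo rm)
echo "$cmdline"
```

This prints I<rm pict0.jpg pict1.jpg pict2.jpg pict3.jpg>.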
2815
2816 =head1 EXAMPLE: Compute intensive jobs and substitution
2817
2818 If ImageMagick is installed this will generate a thumbnail of a jpg
2819 file:
2820
2821   convert -geometry 120 foo.jpg thumb_foo.jpg
2822
2823 This will run with number-of-cpus jobs in parallel for all jpg files
2824 in a directory:
2825
2826   ls *.jpg | parallel convert -geometry 120 {} thumb_{}
2827
2828 To do it recursively use B<find>:
2829
2830   find . -name '*.jpg' | \
2831     parallel convert -geometry 120 {} {}_thumb.jpg
2832
2833 Notice how the argument has to start with B<{}> as B<{}> will include path
2834 (e.g. running B<convert -geometry 120 ./foo/bar.jpg
2835 thumb_./foo/bar.jpg> would clearly be wrong). The command will
2836 generate files like ./foo/bar.jpg_thumb.jpg.
2837
2838 Use B<{.}> to avoid the extra .jpg in the file name. This command will
2839 make files like ./foo/bar_thumb.jpg:
2840
2841   find . -name '*.jpg' | \
2842     parallel convert -geometry 120 {} {.}_thumb.jpg
2843
2844
2845 =head1 EXAMPLE: Substitution and redirection
2846
2847 This will generate an uncompressed version of .gz-files next to the .gz-file:
2848
2849   parallel zcat {} ">"{.} ::: *.gz
2850
2851 Quoting of > is necessary to postpone the redirection. Another
2852 solution is to quote the whole command:
2853
2854   parallel "zcat {} >{.}" ::: *.gz
2855
2856 Other special shell characters (such as * ; $ > < | >> <<) also need
2857 to be put in quotes, as they may otherwise be interpreted by the shell
2858 and not given to GNU B<parallel>.
2859
2860
2861 =head1 EXAMPLE: Composed commands
2862
2863 A job can consist of several commands. This will print the number of
2864 files in each directory:
2865
2866   ls | parallel 'echo -n {}" "; ls {}|wc -l'
2867
2868 To put the output in a file called <name>.dir:
2869
2870   ls | parallel '(echo -n {}" "; ls {}|wc -l) >{}.dir'
2871
2872 Even small shell scripts can be run by GNU B<parallel>:
2873
2874   find . | parallel 'a={}; name=${a##*/};' \
2875     'upper=$(echo "$name" | tr "[:lower:]" "[:upper:]");'\
2876     'echo "$name - $upper"'
2877
2878   ls | parallel 'mv {} "$(echo {} | tr "[:upper:]" "[:lower:]")"'
2879
2880 Given a list of URLs, list all URLs that fail to download. Print the
2881 line number and the URL.
2882
2883   cat urlfile | parallel "wget {} 2>/dev/null || grep -n {} urlfile"
2884
2885 Create a mirror directory with the same filenames except all files and
2886 symlinks are empty files.
2887
2888   cp -rs /the/source/dir mirror_dir
2889   find mirror_dir -type l | parallel -m rm {} '&&' touch {}
2890
2891 Find the files in a list that do not exist:
2892
2893   cat file_list | parallel 'if [ ! -e {} ] ; then echo {}; fi'
2894
2895
2896 =head1 EXAMPLE: Composed command with multiple input sources
2897
2898 You have a dir with files named as 24 hours in 5 minute intervals:
2899 00:00, 00:05, 00:10 .. 23:55. You want to find the files missing:
2900
2901   parallel [ -f {1}:{2} ] "||" echo {1}:{2} does not exist \
2902     ::: {00..23} ::: {00..55..5}
2903
2904
2905 =head1 EXAMPLE: Calling Bash functions
2906
2907 If the composed command is longer than a line, it becomes hard to
2908 read. In Bash you can use functions. Just remember to B<export -f> the
2909 function.
2910
2911   doit() {
2912     echo Doing it for $1
2913     sleep 2
2914     echo Done with $1
2915   }
2916   export -f doit
2917   parallel doit ::: 1 2 3
2918
2919   doubleit() {
2920     echo Doing it for $1 $2
2921     sleep 2
2922     echo Done with $1 $2
2923   }
2924   export -f doubleit
2925   parallel doubleit ::: 1 2 3 ::: a b
2926
2927 To do this on remote servers you need to transfer the function using
2928 B<--env>:
2929
2930   parallel --env doit -S server doit ::: 1 2 3
2931   parallel --env doubleit -S server doubleit ::: 1 2 3 ::: a b
2932
2933 If your environment (aliases, variables, and functions) is small you
2934 can copy the full environment without having to B<export -f>
2935 anything. See B<env_parallel>.
2936
2937
2938 =head1 EXAMPLE: Function tester
2939
2940 To test a program with different parameters:
2941
2942   tester() {
2943     if (eval "$@") >&/dev/null; then
2944       perl -e 'printf "\033[30;102m[ OK ]\033[0m @ARGV\n"' "$@"
2945     else
2946       perl -e 'printf "\033[30;101m[FAIL]\033[0m @ARGV\n"' "$@"
2947     fi
2948   }
2949   export -f tester
2950   parallel tester my_program ::: arg1 arg2
2951   parallel tester exit ::: 1 0 2 0
2952
2953 If B<my_program> fails a red FAIL will be printed followed by the failing
2954 command; otherwise a green OK will be printed followed by the command.
2955
2956
2957 =head1 EXAMPLE: Log rotate
2958
2959 Log rotation renames a logfile to an extension with a higher number:
2960 log.1 becomes log.2, log.2 becomes log.3, and so on. The oldest log is
2961 removed. To avoid overwriting files the process starts backwards from
2962 the high number to the low number.  This will keep 10 old versions of
2963 the log:
2964
2965   seq 9 -1 1 | parallel -j1 mv log.{} log.'{= $_++ =}'
2966   mv log log.1
2967
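For comparison, the same rotation can be sketched as a sequential shell
loop. The sketch is only meant to show why the renaming runs from the
highest number down; B<rotate_logs> and its directory argument are names
made up for the example, and unlike the one-liner above it silently
skips files that do not exist.

```shell
# Sequential sketch of the rotation: log.9 -> log.10, ..., log.1 -> log.2,
# then log -> log.1. rotate_logs and its argument are illustrative names.
rotate_logs() {
  dir=$1
  i=9
  while [ "$i" -ge 1 ]; do
    # Highest number first: log.(i+1) has already moved on, so only the
    # oldest version (log.10) is ever overwritten.
    if [ -e "$dir/log.$i" ]; then
      mv "$dir/log.$i" "$dir/log.$((i + 1))"
    fi
    i=$((i - 1))
  done
  if [ -e "$dir/log" ]; then
    mv "$dir/log" "$dir/log.1"
  fi
}

# Try it in a scratch directory.
demo=$(mktemp -d)
echo current > "$demo/log"
echo older > "$demo/log.1"
rotate_logs "$demo"
ls "$demo"
```

After the call the scratch directory should contain log.1 and log.2,
and no log.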
2968
2969 =head1 EXAMPLE: Removing file extension when processing files
2970
2971 When processing files removing the file extension using B<{.}> is
2972 often useful.
2973
2974 Create a directory for each zip-file and unzip it in that dir:
2975
2976   parallel 'mkdir {.}; cd {.}; unzip ../{}' ::: *.zip
2977
2978 Recompress all .gz files in current directory using B<bzip2> running 1
2979 job per CPU in parallel:
2980
2981   parallel "zcat {} | bzip2 >{.}.bz2 && rm {}" ::: *.gz
2982
2983 Convert all WAV files to MP3 using LAME:
2984
2985   find sounddir -type f -name '*.wav' | parallel lame {} -o {.}.mp3
2986
2987 Put all converted in the same directory:
2988
2989   find sounddir -type f -name '*.wav' | \
2990     parallel lame {} -o mydir/{/.}.mp3
2991
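The replacement strings B<{.}> and B<{/.}> behave much like shell
parameter expansion. The sketch below, using a made-up file name, prints
the two B<lame> command lines that the examples above would be expected
to build for one input. The shell expansions are an approximation and
may differ from GNU B<parallel> in corner cases, for example when only a
directory name contains a dot.

```shell
# Approximate shell equivalents of {.} and {/.} for one made-up path.
path=sounddir/album/track01.wav

no_ext=${path%.*}        # like {.}  : extension removed
base=${path##*/}         # like {/}  : directory part removed
base_no_ext=${base%.*}   # like {/.} : directory and extension removed

echo "lame $path -o $no_ext.mp3"
echo "lame $path -o mydir/$base_no_ext.mp3"
```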
2992
2993 =head1 EXAMPLE: Removing strings from the argument
2994
2995 If you have a directory with tar.gz files and want these extracted in
2996 the corresponding dir (e.g. foo.tar.gz will be extracted in the dir
2997 foo) you can do:
2998
2999   parallel --plus 'mkdir {..}; tar -C {..} -xf {}' ::: *.tar.gz
3000
3001 If you want to remove a different ending, you can use {%string}:
3002
3003   parallel --plus echo {%_demo} ::: mycode_demo keep_demo_here
3004
3005 You can also remove a starting string with {#string}:
3006
3007   parallel --plus echo {#demo_} ::: demo_mycode keep_demo_here
3008
3009 To remove a string anywhere you can use regular expressions with
3010 {/regexp/replacement} and leave the replacement empty:
3011
3012   parallel --plus echo {/demo_/} ::: demo_mycode remove_demo_here
3013
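For the literal strings used above, each replacement string has a close
shell counterpart. The following sketch runs the same arguments through
shell parameter expansion and B<sed>; the function names are made up for
the example. B<{/regexp/replacement}> takes a regular expression,
whereas the shell's % and # take glob patterns, so the comparison is
only meant for plain strings like these.

```shell
# Shell counterparts of the --plus examples above, using the same arguments.
strip_two_ext() { printf '%s\n' "${1%.*.*}"; }             # like {..}
strip_suffix()  { printf '%s\n' "${1%_demo}"; }            # like {%_demo}
strip_prefix()  { printf '%s\n' "${1#demo_}"; }            # like {#demo_}
strip_regexp()  { printf '%s\n' "$1" | sed 's/demo_//'; }  # like {/demo_/}

strip_two_ext foo.tar.gz         # foo
strip_suffix mycode_demo         # mycode
strip_suffix keep_demo_here      # unchanged: _demo is not at the end
strip_prefix demo_mycode         # mycode
strip_prefix keep_demo_here      # unchanged: demo_ is not at the start
strip_regexp demo_mycode         # mycode
strip_regexp remove_demo_here    # remove_here
```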
3014
3015 =head1 EXAMPLE: Download 24 images for each of the past 30 days
3016
3017 Let us assume a website stores images like:
3018
3019   http://www.example.com/path/to/YYYYMMDD_##.jpg
3020
3021 where YYYYMMDD is the date and ## is the number 01-24. This will
3022 download images for the past 30 days:
3023
3024   getit() {
3025     date=$(date -d "today -$1 days" +%Y%m%d)
3026     num=$2
3027     echo wget http://www.example.com/path/to/${date}_${num}.jpg
3028   }
3029   export -f getit
3030
3031   parallel getit ::: $(seq 30) ::: $(seq -w 24)
3032
3033 B<$(date -d "today -$1 days" +%Y%m%d)> will give the dates in
3034 YYYYMMDD with B<$1> days subtracted.
3035
3036
3037 =head1 EXAMPLE: Download world map from NASA
3038
3039 NASA provides tiles to download on earthdata.nasa.gov. Download tiles
3040 for Blue Marble world map and create a 10240x20480 map.
3041
3042   base=https://map1a.vis.earthdata.nasa.gov/wmts-geo/wmts.cgi
3043   service="SERVICE=WMTS&REQUEST=GetTile&VERSION=1.0.0"
3044   layer="LAYER=BlueMarble_ShadedRelief_Bathymetry"
3045   set="STYLE=&TILEMATRIXSET=EPSG4326_500m&TILEMATRIX=5"
3046   tile="TILEROW={1}&TILECOL={2}"
3047   format="FORMAT=image%2Fjpeg"
3048   url="$base?$service&$layer&$set&$tile&$format"
3049
3050   parallel -j0 -q wget "$url" -O {1}_{2}.jpg ::: {0..19} ::: {0..39}
3051   parallel eval convert +append {}_{0..39}.jpg line{}.jpg ::: {0..19}
3052   convert -append line{0..19}.jpg world.jpg
3053
3054
3055 =head1 EXAMPLE: Download Apollo-11 images from NASA using jq
3056
3057 Search NASA using their API to get JSON for images that are related to
3058 'apollo 11' and have 'moon landing' in the description.
3059
3060 The search query returns JSON containing URLs to JSON containing
3061 collections of pictures. One of the pictures in each of these
3062 collections is I<large>.
3063
3064 B<wget> is used to get the JSON for the search query. B<jq> is then
3065 used to extract the URLs of the collections. B<parallel> then calls
3066 B<wget> to get each collection, which is passed to B<jq> to extract
3067 the URLs of all images. B<grep> keeps only the I<large> images, and
3068 B<parallel> finally uses B<wget> to fetch the images.
3069
3070   base="https://images-api.nasa.gov/search"
3071   q="q=apollo 11"
3072   description="description=moon landing"
3073   media_type="media_type=image"
3074   wget -O - "$base?$q&$description&$media_type" |
3075     jq -r .collection.items[].href |
3076     parallel wget -O - |
3077     jq -r .[] |
3078     grep large |
3079     parallel wget
3080
3081
3082 =head1 EXAMPLE: Download video playlist in parallel
3083
3084 B<youtube-dl> is an excellent tool to download videos. It can,
3085 however, not download videos in parallel. This takes a playlist and
3086 downloads 10 videos in parallel.
3087
3088   url='youtu.be/watch?v=0wOf2Fgi3DE&list=UU_cznB5YZZmvAmeq7Y3EriQ'
3089   export url
3090   youtube-dl --flat-playlist "https://$url" |
3091     parallel --tagstring {#} --lb -j10 \
3092       youtube-dl --playlist-start {#} --playlist-end {#} '"https://$url"'
3093
3094
3095 =head1 EXAMPLE: Prepend last modified date (ISO8601) to file name
3096
3097   parallel mv {} '{= $a=pQ($_); $b=$_;' \
3098     '$_=qx{date -r "$a" +%FT%T}; chomp; $_="$_ $b" =}' ::: *
3099
3100 B<{=> and B<=}> mark a perl expression. B<pQ> perl-quotes the
3101 string. B<date +%FT%T> is the date in ISO8601 with time.
3102
3103 =head1 EXAMPLE: Save output in ISO8601 dirs
3104
3105 Save output from B<ps aux> every second into dirs named
3106 yyyy-mm-ddThh:mm:ss+zz:zz.
3107
3108   seq 1000 | parallel -N0 -j1 --delay 1 \
3109     --results '{= $_=`date -Isec`; chomp=}/' ps aux
3110
3111
3112 =head1 EXAMPLE: Digital clock with "blinking" :
3113
3114 The : in a digital clock blinks. To make every other line have a ':'
3115 and the rest a ' ' a perl expression is used to look at the 3rd input
3116 source. If the value modulo 2 is 1: Use ":" otherwise use " ":
3117
3118   parallel -k echo {1}'{=3 $_=$_%2?":":" "=}'{2}{3} \
3119     ::: {0..12} ::: {0..5} ::: {0..9}
3120
3121
3122 =head1 EXAMPLE: Aggregating content of files
3123
3124 This:
3125
3126   parallel --header : echo x{X}y{Y}z{Z} \> x{X}y{Y}z{Z} \
3127   ::: X {1..5} ::: Y {01..10} ::: Z {1..5}
3128
3129 will generate the files x1y01z1 .. x5y10z5. If you want to aggregate
3130 the output grouping on x and z you can do this:
3131
3132   parallel eval 'cat {=s/y01/y*/=} > {=s/y01//=}' ::: *y01*
3133
3134 For all values of x and z it runs commands like:
3135
3136   cat x1y*z1 > x1z1
3137
3138 So you end up with x1z1 .. x5z5 each containing the content of all
3139 values of y.
3140
3141
3142 =head1 EXAMPLE: Breadth first parallel web crawler/mirrorer
3143
3144 The script below will crawl and mirror a URL in parallel. It first
3145 downloads pages that are 1 click down, then 2 clicks down, then
3146 3; instead of the normal depth first, where the first link on
3147 each page is fetched first.
3148
3149 Run like this:
3150
3151   PARALLEL=-j100 ./parallel-crawl http://gatt.org.yeslab.org/
3152
3153 Remove the B<wget> part if you only want a web crawler.
3154
3155 It works by fetching a page from a list of URLs and looking for links
3156 in that page that are within the same starting URL and that have not
3157 already been seen. These links are added to a new queue. When all the
3158 pages from the list are done, the new queue is moved to the list of
3159 URLs and the process is started over until no unseen links are found.
3160
3161   #!/bin/bash
3162
3163   # E.g. http://gatt.org.yeslab.org/
3164   URL=$1
3165   # Stay inside the start dir
3166   BASEURL=$(echo $URL | perl -pe 's:#.*::; s:(//.*/)[^/]*:$1:')
3167   URLLIST=$(mktemp urllist.XXXX)
3168   URLLIST2=$(mktemp urllist.XXXX)
3169   SEEN=$(mktemp seen.XXXX)
3170
3171   # Spider to get the URLs
3172   echo $URL >$URLLIST
3173   cp $URLLIST $SEEN
3174
3175   while [ -s $URLLIST ] ; do
3176     cat $URLLIST |
3177       parallel lynx -listonly -image_links -dump {} \; \
3178         wget -qm -l1 -Q1 {} \; echo Spidered: {} \>\&2 |
3179         perl -ne 's/#.*//; s/\s+\d+.\s(\S+)$/$1/ and
3180           do { $seen{$1}++ or print }' |
3181       grep -F $BASEURL |
3182       grep -v -x -F -f $SEEN | tee -a $SEEN > $URLLIST2
3183     mv $URLLIST2 $URLLIST
3184   done
3185
3186   rm -f $URLLIST $URLLIST2 $SEEN
3187
3188
3189 =head1 EXAMPLE: Process files from a tar file while unpacking
3190
3191 If the files to be processed are in a tar file then unpacking one file
3192 and processing it immediately may be faster than first unpacking all
3193 files.
3194
3195   tar xvf foo.tgz | perl -ne 'print $l;$l=$_;END{print $l}' | \
3196     parallel echo
3197
3198 The Perl one-liner is needed to make sure the file is complete before
3199 handing it to GNU B<parallel>.
3200
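The one-liner delays every line by one: a file name is passed on only
when the next name has been printed (or when the input ends), presumably
because B<tar> is then done with the previous file. A shell sketch of
the same buffering, with B<delay_one_line> as a made-up name:

```shell
# Hold each line back until the next one arrives; flush the last at EOF.
# Roughly the same effect as: perl -ne 'print $l;$l=$_;END{print $l}'
delay_one_line() {
  have_prev=0
  while IFS= read -r line || [ -n "$line" ]; do
    if [ "$have_prev" -eq 1 ]; then
      printf '%s\n' "$prev"
    fi
    prev=$line
    have_prev=1
  done
  if [ "$have_prev" -eq 1 ]; then
    printf '%s\n' "$prev"
  fi
}

printf 'a.txt\nb.txt\nc.txt\n' | delay_one_line
```

The output has the same lines in the same order as the input; only the
timing differs.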
3201
3202 =head1 EXAMPLE: Rewriting a for-loop and a while-read-loop
3203
3204 for-loops like this:
3205
3206   (for x in `cat list` ; do
3207     do_something $x
3208   done) | process_output
3209
3210 and while-read-loops like this:
3211
3212   cat list | (while read x ; do
3213     do_something $x
3214   done) | process_output
3215
3216 can be written like this:
3217
3218   cat list | parallel do_something | process_output
3219
3220 For example: Find which host name in a list has IP address 1.2.3.4:
3221
3222   cat hosts.txt | parallel -P 100 host | grep 1.2.3.4
3223
3224 If the processing requires more steps the for-loop like this:
3225
3226   (for x in `cat list` ; do
3227     no_extension=${x%.*};
3228     do_step1 $x scale $no_extension.jpg
3229     do_step2 <$x $no_extension
3230   done) | process_output
3231
3232 and while-loops like this:
3233
3234   cat list | (while read x ; do
3235     no_extension=${x%.*};
3236     do_step1 $x scale $no_extension.jpg
3237     do_step2 <$x $no_extension
3238   done) | process_output
3239
3240 can be written like this:
3241
3242   cat list | parallel "do_step1 {} scale {.}.jpg ; do_step2 <{} {.}" |\
3243     process_output
3244
3245 If the body of the loop is bigger, it improves readability to use a function:
3246
3247   (for x in `cat list` ; do
3248     do_something $x
3249     [... 100 lines that do something with $x ...]
3250   done) | process_output
3251
3252   cat list | (while read x ; do
3253     do_something $x
3254     [... 100 lines that do something with $x ...]
3255   done) | process_output
3256
3257 can both be rewritten as:
3258
3259   doit() {
3260     x=$1
3261     do_something $x
3262     [... 100 lines that do something with $x ...]
3263   }
3264   export -f doit
3265   cat list | parallel doit
3266
3267 =head1 EXAMPLE: Rewriting nested for-loops
3268
3269 Nested for-loops like this:
3270
3271   (for x in `cat xlist` ; do
3272     for y in `cat ylist` ; do
3273       do_something $x $y
3274     done
3275   done) | process_output
3276
3277 can be written like this:
3278
3279   parallel do_something {1} {2} :::: xlist ylist | process_output
3280
3281 Nested for-loops like this:
3282
3283   (for colour in red green blue ; do
3284     for size in S M L XL XXL ; do
3285       echo $colour $size
3286     done
3287   done) | sort
3288
3289 can be written like this:
3290
3291   parallel echo {1} {2} ::: red green blue ::: S M L XL XXL | sort
3292
3293
3294 =head1 EXAMPLE: Finding the lowest difference between files
3295
3296 B<diff> is good for finding differences in text files. B<diff | wc -l>
3297 gives an indication of the size of the difference. To find the
3298 differences between all files in the current dir do:
3299
3300   parallel --tag 'diff {1} {2} | wc -l' ::: * ::: * | sort -nk3
3301
3302 This way it is possible to see if some files are closer to other
3303 files.
3304
3305
3306 =head1 EXAMPLE: for-loops with column names
3307
3308 When doing multiple nested for-loops it can be easier to keep track of
3309 the loop variable if it is named instead of just having a number. Use
3310 B<--header :> to let the first argument be a named alias for the
3311 positional replacement string:
3312
3313   parallel --header : echo {colour} {size} \
3314     ::: colour red green blue ::: size S M L XL XXL
3315
3316 This also works if the input file is a file with columns:
3317
3318   cat addressbook.tsv | \
3319     parallel --colsep '\t' --header : echo {Name} {E-mail address}
3320
3321
3322 =head1 EXAMPLE: All combinations in a list
3323
3324 GNU B<parallel> makes all combinations when given two lists.
3325
3326 To make all combinations in a single list with unique values, you
3327 repeat the list and use replacement string B<{choose_k}>:
3328
3329   parallel --plus echo {choose_k} ::: A B C D ::: A B C D
3330
3331   parallel --plus echo 2{2choose_k} 1{1choose_k} ::: A B C D ::: A B C D
3332
3333 B<{choose_k}> works for any number of input sources:
3334
3335   parallel --plus echo {choose_k} ::: A B C D ::: A B C D ::: A B C D
3336
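Two identical input sources of four values would normally give all 16
ordered pairs, including A A and both A B and B A. As described above,
B<{choose_k}> is meant to leave each combination of distinct values
once. The shell sketch below prints that selection for pairs; it only
illustrates which combinations to expect, not how GNU B<parallel>
implements B<{choose_k}>, and B<pairs_once> is a made-up name.

```shell
# Print every unordered pair once: the first element is paired with each
# later element, then dropped, and so on. pairs_once is a made-up name.
pairs_once() {
  while [ "$#" -gt 1 ]; do
    first=$1
    shift
    for other in "$@"; do
      printf '%s %s\n' "$first" "$other"
    done
  done
}

pairs_once A B C D
```

For four values this gives 6 pairs (4 choose 2) instead of 16.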
3337
3338 =head1 EXAMPLE: From a to b and b to c
3339
3340 Assume you have input like:
3341
3342   aardvark
3343   babble
3344   cab
3345   dab
3346   each
3347
3348 and want to run combinations like:
3349
3350   aardvark babble
3351   babble cab
3352   cab dab
3353   dab each
3354
3355 If the input is in the file in.txt:
3356
3357   parallel echo {1} - {2} ::::+ <(head -n -1 in.txt) <(tail -n +2 in.txt)
3358
3359 If the input is in the array $a here are two solutions:
3360
3361   seq $((${#a[@]}-1)) | \
3362     env_parallel --env a echo '${a[{=$_--=}]} - ${a[{}]}'
3363   parallel echo {1} - {2} ::: "${a[@]::${#a[@]}-1}" :::+ "${a[@]:1}"
3364
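The array solution relies on two bash slices: one without the last
element and one without the first. B<:::+> then links them element by
element. A bash-only sketch that prints what gets paired (the names
B<left> and B<right> are made up):

```shell
# bash sketch: what the two array slices passed to ::: and :::+ contain.
a=(aardvark babble cab dab each)

left=("${a[@]::${#a[@]}-1}")   # all but the last element
right=("${a[@]:1}")            # all but the first element

# :::+ links the sources element by element; this loop does the same.
for i in "${!left[@]}"; do
  echo "${left[i]} - ${right[i]}"
done
```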
3365
3366 =head1 EXAMPLE: Count the differences between all files in a dir
3367
3368 Using B<--results> the results are saved in /tmp/diffcount*.
3369
3370   parallel --results /tmp/diffcount "diff -U 0 {1} {2} | \
3371     tail -n +3 |grep -v '^@'|wc -l" ::: * ::: *
3372
3373 To see the difference between file A and file B look at the file
3374 '/tmp/diffcount/1/A/2/B'.
3375
3376
3377 =head1 EXAMPLE: Speeding up fast jobs
3378
3379 Starting a job on the local machine takes around 10 ms. This can be a
3380 big overhead if the job takes very few ms to run. Often you can group
3381 small jobs together using B<-X> which will make the overhead less
3382 significant. Compare the speed of these:
3383
3384   seq -w 0 9999 | parallel touch pict{}.jpg
3385   seq -w 0 9999 | parallel -X touch pict{}.jpg
3386
3387 If your program cannot take multiple arguments, then you can use GNU
3388 B<parallel> to spawn multiple GNU B<parallel>s:
3389
3390   seq -w 0 9999999 | \
3391     parallel -j10 -q -I,, --pipe parallel -j0 touch pict{}.jpg
3392
3393 If B<-j0> normally spawns 252 jobs, then the above will try to spawn
3394 2520 jobs. On a normal GNU/Linux system you can spawn 32000 jobs using
3395 this technique with no problems. To raise the 32000 jobs limit raise
3396 /proc/sys/kernel/pid_max to 4194303.
3397
3398 If you do not need GNU B<parallel> to have control over each job (so
3399 no need for B<--retries> or B<--joblog> or similar), then it can be
3400 even faster if you can generate the command lines and pipe those to a
3401 shell. So if you can do this:
3402
3403   mygenerator | sh
3404
3405 Then that can be parallelized like this:
3406
3407   mygenerator | parallel --pipe --block 10M sh
3408
3409 E.g.
3410
3411   mygenerator() {
3412     seq 10000000 | perl -pe 'print "echo This is fast job number "';
3413   }
3414   mygenerator | parallel --pipe --block 10M sh
3415
3416 The overhead is 100000 times smaller, namely around 100 nanoseconds per
3417 job.
3418
3419
3420 =head1 EXAMPLE: Using shell variables
3421
3422 When using shell variables you need to quote them correctly as they
3423 may otherwise be interpreted by the shell.
3424
3425 Notice the difference between:
3426
3427   ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
3428   parallel echo ::: ${ARR[@]} # This is probably not what you want
3429
3430 and:
3431
3432   ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
3433   parallel echo ::: "${ARR[@]}"
3434
3435 When using variables in the actual command that contain special
3436 characters (e.g. space) you can quote them using B<'"$VAR"'> or using
3437 "'s and B<-q>:
3438
3439   VAR="My brother's 12\" records are worth <\$\$\$>"
3440   parallel -q echo "$VAR" ::: '!'
3441   export VAR
3442   parallel echo '"$VAR"' ::: '!'
3443
3444 If B<$VAR> does not contain ' then B<"'$VAR'"> will also work
3445 (and does not need B<export>):
3446
3447   VAR="My 12\" records are worth <\$\$\$>"
3448   parallel echo "'$VAR'" ::: '!'
3449
3450 If you use them in a function you just quote as you normally would do:
3451
3452   VAR="My brother's 12\" records are worth <\$\$\$>"
3453   export VAR
3454   myfunc() { echo "$VAR" "$1"; }
3455   export -f myfunc
3456   parallel myfunc ::: '!'
3457
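The difference between the quoted and the unquoted array can be made
visible by counting the arguments a command receives. This bash sketch
uses a made-up helper, B<count_args>, and assumes the default B<IFS>:

```shell
# bash sketch: count how many separate arguments a command receives.
count_args() { echo "$#"; }

ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)

unquoted=$(count_args ${ARR[@]})     # split on whitespace: 9 arguments
quoted=$(count_args "${ARR[@]}")     # one argument per element: 3

echo "unquoted: $unquoted, quoted: $quoted"
```

Unquoted, the first element is split into 7 words, so GNU B<parallel>
would be handed 9 arguments instead of 3.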
3458
3459 =head1 EXAMPLE: Group output lines
3460
3461 When running jobs that output data, you often do not want the output
3462 of multiple jobs to run together. GNU B<parallel> defaults to grouping
3463 the output of each job, so the output is printed when the job
3464 finishes. If you want full lines to be printed while the job is
3465 running you can use B<--line-buffer>. If you want output to be
3466 printed as soon as possible you can use B<-u>.
3467
3468 Compare the output of:
3469
3470   parallel wget --limit-rate=100k \
3471     https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3472     ::: {12..16}
3473   parallel --line-buffer wget --limit-rate=100k \
3474     https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3475     ::: {12..16}
3476   parallel -u wget --limit-rate=100k \
3477     https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3478     ::: {12..16}
3479
3480 =head1 EXAMPLE: Tag output lines
3481
3482 GNU B<parallel> groups the output lines, but it can be hard to see
3483 where the different jobs begin. B<--tag> prepends the argument to make
3484 that more visible:
3485
3486   parallel --tag wget --limit-rate=100k \
3487     https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3488     ::: {12..16}
3489
3490 B<--tag> works with B<--line-buffer> but not with B<-u>:
3491
3492   parallel --tag --line-buffer wget --limit-rate=100k \
3493     https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
3494     ::: {12..16}
3495
3496 Check the uptime of the servers in I<~/.parallel/sshloginfile>:
3497
3498   parallel --tag -S .. --nonall uptime
3499
3500
3501 =head1 EXAMPLE: Colorize output
3502
3503 Give each job a new color. Most terminals support ANSI colors with the
3504 escape code "\033[30;3Xm" where 0 <= X <= 7:
3505
3506     seq 10 | \
3507       parallel --tagstring '\033[30;3{=$_=++$::color%8=}m' seq {}
3508     parallel --rpl '{color} $_="\033[30;3".(++$::color%8)."m"' \
3509       --tagstring {color} seq {} ::: {1..10}
3510
3511 To get rid of the initial \t (which comes from B<--tagstring>):
3512
3513     ... | perl -pe 's/\t//'
3514
3515
3516 =head1 EXAMPLE: Keep order of output same as order of input
3517
3518 Normally the output of a job will be printed as soon as it
3519 completes. Sometimes you want the order of the output to remain the
3520 same as the order of the input. This is often important, if the output
3521 is used as input for another system. B<-k> will make sure the order of
3522 output will be in the same order as input even if later jobs end
3523 before earlier jobs.
3524
3525 Append a string to every line in a text file:
3526
3527   cat textfile | parallel -k echo {} append_string
3528
3529 If you remove B<-k> some of the lines may come out in the wrong order.
3530
3531 Another example is B<traceroute>:
3532
3533   parallel traceroute ::: qubes-os.org debian.org freenetproject.org
3534
3535 will give traceroute of qubes-os.org, debian.org and
3536 freenetproject.org, but it will be sorted according to which job
3537 completed first.
3538
3539 To keep the order the same as input run:
3540
3541   parallel -k traceroute ::: qubes-os.org debian.org freenetproject.org
3542
3543 This will make sure the traceroute to qubes-os.org will be printed
3544 first.
3545
3546 A bit more complex example is downloading a huge file in chunks in
3547 parallel: Some internet connections will deliver more data if you
3548 download files in parallel. For downloading files in parallel see:
3549 "EXAMPLE: Download 24 images for each of the past 30 days". But if you
3550 are downloading a big file you can download the file in chunks in
3551 parallel.
3552
3553 To download byte 10000000-19999999 you can use B<curl>:
3554
3555   curl -r 10000000-19999999 http://example.com/the/big/file >file.part
3556
3557 To download a 1 GB file we need 100 10MB chunks downloaded and
3558 combined in the correct order.
3559
3560   seq 0 99 | parallel -k curl -r \
3561     {}0000000-{}9999999 http://example.com/the/big/file > file
3562
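The B<{}0000000-{}9999999> pattern works because the chunk size is a
power of ten. For other chunk sizes the ranges can be computed instead.
A sketch, where B<byte_ranges> is a made-up helper that takes the chunk
size in bytes and the number of chunks:

```shell
# Print the byte range of each chunk, one "first-last" pair per line.
# byte_ranges is a made-up helper name.
byte_ranges() {
  chunk_size=$1
  chunks=$2
  i=0
  while [ "$i" -lt "$chunks" ]; do
    echo "$((i * chunk_size))-$((chunk_size * (i + 1) - 1))"
    i=$((i + 1))
  done
}

# The 100 ranges of 10000000 bytes used above; show the first and last two.
byte_ranges 10000000 100 | sed -n '1,2p;99,100p'
```

Assuming the server honours range requests, the output could replace
B<seq> in the command above:

  byte_ranges 10000000 100 | \
    parallel -k curl -r {} http://example.com/the/big/file > file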
3563
3564 =head1 EXAMPLE: Parallel grep
3565
3566 B<grep -r> greps recursively through directories. On multicore CPUs
3567 GNU B<parallel> can often speed this up.
3568
3569   find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}
3570
3571 This will run 1.5 jobs per CPU, and give 1000 arguments to B<grep>.
3572
3573
3574 =head1 EXAMPLE: Grepping n lines for m regular expressions.
3575
3576 The simplest solution to grep a big file for a lot of regexps is:
3577
3578   grep -f regexps.txt bigfile
3579
3580 Or if the regexps are fixed strings:
3581
3582   grep -F -f regexps.txt bigfile
3583
3584 There are 3 limiting factors: CPU, RAM, and disk I/O.
3585
3586 RAM is easy to measure: If the B<grep> process takes up most of your
3587 free memory (e.g. when running B<top>), then RAM is a limiting factor.
3588
3589 CPU is also easy to measure: If the B<grep> takes >90% CPU in B<top>,
3590 then the CPU is a limiting factor, and parallelization will speed this
3591 up.
3592
3593 It is harder to see if disk I/O is the limiting factor, and depending
3594 on the disk system it may be faster or slower to parallelize. The only
3595 way to know for certain is to test and measure.
3596
3597
3598 =head2 Limiting factor: RAM
3599
3600 The normal B<grep -f regexps.txt bigfile> works no matter the size of
3601 bigfile, but if regexps.txt is so big it cannot fit into memory, then
3602 you need to split this.
3603
3604 B<grep -F> takes around 100 bytes of RAM and B<grep> takes about 500
3605 bytes of RAM per 1 byte of regexp. So if regexps.txt is 1% of your
3606 RAM, then it may be too big.
3607
3608 If you can convert your regexps into fixed strings do that. E.g. if
3609 the lines you are looking for in bigfile all look like:
3610
3611   ID1 foo bar baz Identifier1 quux
3612   fubar ID2 foo bar baz Identifier2
3613
3614 then your regexps.txt can be converted from:
3615
3616   ID1.*Identifier1
3617   ID2.*Identifier2
3618
3619 into:
3620
3621   ID1 foo bar baz Identifier1
3622   ID2 foo bar baz Identifier2
3623
3624 This way you can use B<grep -F> which takes around 80% less memory and
3625 is much faster.
3626
3627 If it still does not fit in memory you can do this:
3628
3629   parallel --pipepart -a regexps.txt --block 1M grep -Ff - -n bigfile | \
3630     sort -un | perl -pe 's/^\d+://'
3631
3632 The 1M should be your free memory divided by the number of CPU threads and
3633 divided by 200 for B<grep -F> and by 1000 for normal B<grep>. On
3634 GNU/Linux you can do:
3635
3636   free=$(awk '/^((Swap)?Cached|MemFree|Buffers):/ { sum += $2 }
3637               END { print sum }' /proc/meminfo)
3638   percpu=$((free / 200 / $(parallel --number-of-threads)))k
3639
3640   parallel --pipepart -a regexps.txt --block $percpu --compress \
3641     grep -F -f - -n bigfile | \
3642     sort -un | perl -pe 's/^\d+://'
3643
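As a worked example of the rule, assume 8 GB of free memory (8388608 kB)
and 8 CPU threads; both numbers are made up. B<block_size_k> is a
hypothetical helper that does the same integer division as the snippet
above:

```shell
# Block size = free memory / divisor / CPU threads, rounded down, in kB.
# block_size_k is a hypothetical helper; the inputs below are made up.
block_size_k() {
  free_kb=$1   # free memory in kB
  threads=$2   # number of CPU threads
  divisor=$3   # 200 for grep -F, 1000 for normal grep
  echo "$((free_kb / divisor / threads))k"
}

block_size_k 8388608 8 200    # grep -F:      5242k
block_size_k 8388608 8 1000   # normal grep:  1048k
```
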
3644 If you can live with duplicated lines and wrong order, it is faster to do:
3645
3646   parallel --pipepart -a regexps.txt --block $percpu --compress \
3647     grep -F -f - bigfile
3648
3649 =head2 Limiting factor: CPU
3650
3651 If the CPU is the limiting factor parallelization should be done on
3652 the regexps:
3653
3654   cat regexp.txt | parallel --pipe -L1000 --roundrobin --compress \
3655     grep -f - -n bigfile | \
3656     sort -un | perl -pe 's/^\d+://'
3657
3658 The command will start one B<grep> per CPU and read I<bigfile> one
3659 time per CPU, but as that is done in parallel, all reads except the
3660 first will be cached in RAM. Depending on the size of I<regexp.txt> it
3661 may be faster to use B<--block 10m> instead of B<-L1000>.
3662
3663 Some storage systems perform better when reading multiple chunks in
3664 parallel. This is true for some RAID systems and for some network file
3665 systems. To parallelize the reading of I<bigfile>:
3666
3667   parallel --pipepart --block 100M -a bigfile -k --compress \
3668     grep -f regexp.txt
3669
3670 This will split I<bigfile> into 100MB chunks and run B<grep> on each of
3671 these chunks. To parallelize both reading of I<bigfile> and I<regexp.txt>
3672 combine the two using B<--fifo>:
3673
3674   parallel --pipepart --block 100M -a bigfile --fifo cat regexp.txt \
3675     \| parallel --pipe -L1000 --roundrobin grep -f - {}
3676
3677 If a line matches multiple regexps, the line may be duplicated.
3678
3679 =head2 Bigger problem
3680
3681 If the problem is too big to be solved by this, you are probably ready
3682 for Lucene.
3683
3684
3685 =head1 EXAMPLE: Using remote computers
3686
3687 To run commands on a remote computer SSH needs to be set up and you
3688 must be able to login without entering a password (The commands
3689 B<ssh-copy-id>, B<ssh-agent>, and B<sshpass> may help you do that).
3690
3691 If you need to login to a whole cluster, you typically do not want to
3692 accept the host key for every host. You want to accept them the first
3693 time and be warned if they are ever changed. To do that:
3694
3695   # Add the servers to the sshloginfile
3696   (echo servera; echo serverb) > .parallel/my_cluster
3697   # Make sure .ssh/config exists
3698   touch .ssh/config
3699   cp .ssh/config .ssh/config.backup
3700   # Disable StrictHostKeyChecking temporarily
3701   (echo 'Host *'; echo StrictHostKeyChecking no) >> .ssh/config
3702   parallel --slf my_cluster --nonall true
3703   # Remove the disabling of StrictHostKeyChecking
3704   mv .ssh/config.backup .ssh/config
3705
3706 The servers in B<.parallel/my_cluster> are now added in B<.ssh/known_hosts>.
3707
3708 To run B<echo> on B<server.example.com>:
3709
3710   seq 10 | parallel --sshlogin server.example.com echo
3711
3712 To run commands on more than one remote computer run:
3713
3714   seq 10 | parallel --sshlogin s1.example.com,s2.example.net echo
3715
3716 Or:
3717
3718   seq 10 | parallel --sshlogin server.example.com \
3719     --sshlogin server2.example.net echo
3720
3721 If the login username is I<foo> on I<server2.example.net> use:
3722
3723   seq 10 | parallel --sshlogin server.example.com \
3724     --sshlogin foo@server2.example.net echo
3725
3726 If your list of hosts is I<server1-88.example.net> with login I<foo>:
3727
3728   seq 10 | parallel -Sfoo@server{1..88}.example.net echo
3729
3730 To distribute the commands to a list of computers, make a file
3731 I<mycomputers> with all the computers:
3732
3733   server.example.com
3734   foo@server2.example.com
3735   server3.example.com
3736
3737 Then run:
3738
3739   seq 10 | parallel --sshloginfile mycomputers echo
3740
3741 To include the local computer add the special sshlogin ':' to the list:
3742
3743   server.example.com
3744   foo@server2.example.com
3745   server3.example.com
3746   :
3747
3748 GNU B<parallel> will try to determine the number of CPUs on each of
3749 the remote computers, and run one job per CPU - even if the remote
3750 computers do not have the same number of CPUs.
3751
3752 If the number of CPUs on the remote computers is not identified
3753 correctly the number of CPUs can be added in front. Here the computer
3754 has 8 CPUs.
3755
3756   seq 10 | parallel --sshlogin 8/server.example.com echo
3757
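An sshloginfile for a large cluster does not have to be written by
hand. This sketch prints the 88 logins from the example above followed
by the special sshlogin ':'; B<make_sshloginfile> is a made-up name.

```shell
# Print one sshlogin per line: foo@server1 .. foo@server88.example.net,
# then ':' for the local computer. make_sshloginfile is a made-up name.
make_sshloginfile() {
  n=1
  while [ "$n" -le 88 ]; do
    echo "foo@server$n.example.net"
    n=$((n + 1))
  done
  echo ":"   # special sshlogin for the local computer
}

# Show the last two lines of the generated list.
make_sshloginfile | tail -n 2
```

Redirect the output to I<mycomputers> and pass it with
B<--sshloginfile> as shown above.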
3758
3759 =head1 EXAMPLE: Transferring of files
3760
3761 To recompress gzipped files with B<bzip2> using a remote computer run:
3762
3763   find logs/ -name '*.gz' | \
3764     parallel --sshlogin server.example.com \
3765     --transfer "zcat {} | bzip2 -9 >{.}.bz2"
3766
3767 This will list the .gz-files in the I<logs> directory and all
3768 directories below. Then it will transfer the files to
3769 I<server.example.com> to the corresponding directory in
3770 I<$HOME/logs>. On I<server.example.com> the file will be recompressed
3771 using B<zcat> and B<bzip2> resulting in the corresponding file with
3772 I<.gz> replaced with I<.bz2>.
3773
3774 If you want the resulting bz2-file to be transferred back to the local
3775 computer add I<--return {.}.bz2>:
3776
3777   find logs/ -name '*.gz' | \
3778     parallel --sshlogin server.example.com \
3779     --transfer --return {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"
3780
3781 After the recompressing is done the I<.bz2>-file is transferred back to
3782 the local computer and put next to the original I<.gz>-file.
3783
3784 If you want to delete the transferred files on the remote computer add
3785 I<--cleanup>. This will remove both the file transferred to the remote
3786 computer and the files transferred from the remote computer:
3787
3788   find logs/ -name '*.gz' | \
3789     parallel --sshlogin server.example.com \
3790     --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"
3791
3792 If you want to run on several computers add the computers to I<--sshlogin>
3793 either using ',' or multiple I<--sshlogin>:
3794
3795   find logs/ -name '*.gz' | \
3796     parallel --sshlogin server.example.com,server2.example.com \
3797     --sshlogin server3.example.com \
3798     --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"
3799
3800 You can add the local computer using I<--sshlogin :>. This will disable the
3801 removing and transferring for the local computer only:
3802
3803   find logs/ -name '*.gz' | \
3804     parallel --sshlogin server.example.com,server2.example.com \
3805     --sshlogin server3.example.com \
3806     --sshlogin : \
3807     --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"
3808
3809 Often I<--transfer>, I<--return> and I<--cleanup> are used together. They can be
3810 shortened to I<--trc>:
3811
3812   find logs/ -name '*.gz' | \
3813     parallel --sshlogin server.example.com,server2.example.com \
3814     --sshlogin server3.example.com \
3815     --sshlogin : \
3816     --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"
3817
3818 With the file I<mycomputers> containing the list of computers it becomes:
3819
3820   find logs/ -name '*.gz' | parallel --sshloginfile mycomputers \
3821     --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"
3822
3823 If the file I<~/.parallel/sshloginfile> contains the list of computers
3824 the special short hand I<-S ..> can be used:
3825
3826   find logs/ -name '*.gz' | parallel -S .. \
3827     --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"
3828
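To see which command line each job in these examples gets, the replacement strings can be emulated with shell parameter expansion. This is only an illustration: B<show_job> is a hypothetical helper, the file name is made up, and the expansion B<${f%.*}> is assumed to match B<{.}> for paths whose last component has an extension.

```shell
# Emulate the {} and {.} replacement strings for one input line.
# show_job is a hypothetical helper, not part of GNU parallel.
show_job() {
  f=$1            # what {} would be replaced with
  noext=${f%.*}   # what {.} would be: the path without its last extension
  printf 'zcat %s | bzip2 -9 >%s.bz2\n' "$f" "$noext"
}

show_job logs/2019/access.log.gz
# prints: zcat logs/2019/access.log.gz | bzip2 -9 >logs/2019/access.log.bz2
```

With GNU B<parallel> installed, B<--dry-run> prints the real command lines without running them, which is the safer way to check a command before files are transferred.
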
3829
3830 =head1 EXAMPLE: Distributing work to local and remote computers
3831
3832 Convert *.mp3 to *.ogg running one process per CPU on local computer
3833 and server2:
3834
3835   parallel --trc {.}.ogg -S server2,: \
3836     'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg' ::: *.mp3
3837
3838
3839 =head1 EXAMPLE: Running the same command on remote computers
3840
3841 To run the command B<uptime> on remote computers you can do:
3842
3843   parallel --tag --nonall -S server1,server2 uptime
3844
3845 B<--nonall> reads no arguments. If you have a list of jobs you want
3846 to run on each computer you can do:
3847
3848   parallel --tag --onall -S server1,server2 echo ::: 1 2 3
3849
3850 Remove B<--tag> if you do not want the sshlogin added before the
3851 output.
3852
3853 If you have a lot of hosts use '-j0' to access more hosts in parallel.
3854
3855
3856 =head1 EXAMPLE: Using remote computers behind NAT wall
3857
3858 If the workers are behind a NAT wall, you need some trickery to get to
3859 them.
3860
3861 If you can B<ssh> to a jumphost, and reach the workers from there,
3862 then the obvious solution would be this, but it B<does not work>:
3863
3864   parallel --ssh 'ssh jumphost ssh' -S host1 echo ::: DOES NOT WORK
3865
3866 It does not work because the command is dequoted by B<ssh> twice,
3867 whereas GNU B<parallel> only expects it to be dequoted once.
3868
3869 So instead put this in B<~/.ssh/config>:
3870
3871   Host host1 host2 host3
3872     ProxyCommand ssh jumphost.domain nc -w 1 %h 22
3873
3874 It requires B<nc> (netcat) to be installed on the jumphost. With this
3875 you can simply run:
3876
3877   parallel -S host1,host2,host3 echo ::: This does work
3878
3879 =head2 No jumphost, but port forwards
3880
3881 If there is no jumphost but each server has port 22 forwarded from the
3882 firewall (e.g. the firewall's port 22001 = port 22 on host1, 22002 = host2,
3883 22003 = host3) then you can use B<~/.ssh/config>:
3884
3885   Host host1.v
3886     Port 22001
3887   Host host2.v
3888     Port 22002
3889   Host host3.v
3890     Port 22003
3891   Host *.v
3892     Hostname firewall
3893
3894 And then use host{1..3}.v as normal hosts:
3895
3896   parallel -S host1.v,host2.v,host3.v echo ::: a b c
3897
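If there are many hosts, the B<~/.ssh/config> stanzas above can be generated instead of typed. A minimal sketch, assuming the numbering scheme of the example (hostI<N>.v on port 22000+I<N>, all reached through a host called I<firewall>); B<gen_ssh_config> is a hypothetical helper name.

```shell
# Print ssh_config stanzas for host1.v .. hostN.v.
# gen_ssh_config is a hypothetical helper; ports and names follow the example.
gen_ssh_config() {
  # $1 = number of hosts behind the firewall
  i=1
  while [ "$i" -le "$1" ]; do
    printf 'Host host%d.v\n  Port %d\n' "$i" $((22000 + i))
    i=$((i + 1))
  done
  # All *.v names resolve to the firewall; only the port differs.
  printf 'Host *.v\n  Hostname firewall\n'
}

gen_ssh_config 3
```

Review the output before appending it to B<~/.ssh/config>.
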
3898 =head2 No jumphost, no port forwards
3899
3900 If ports cannot be forwarded, you need some sort of VPN to traverse
3901 the NAT-wall. TOR is one option for that, as it is very easy to get
3902 working.
3903
3904 You need to install TOR and set up a hidden service. In B<torrc> put:
3905
3906   HiddenServiceDir /var/lib/tor/hidden_service/
3907   HiddenServicePort 22 127.0.0.1:22
3908
3909 Then start TOR: B</etc/init.d/tor restart>
3910
3911 The TOR hostname is now in B</var/lib/tor/hidden_service/hostname> and
3912 is something similar to B<izjafdceobowklhz.onion>. Now you simply
3913 prepend B<torsocks> to B<ssh>:
3914
3915   parallel --ssh 'torsocks ssh' -S izjafdceobowklhz.onion \
3916     -S zfcdaeiojoklbwhz.onion,auclucjzobowklhi.onion echo ::: a b c
3917
3918 If not all hosts are accessible through TOR:
3919
3920   parallel -S 'torsocks ssh izjafdceobowklhz.onion,host2,host3' \
3921     echo ::: a b c
3922
3923 See more B<ssh> tricks on https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Proxies_and_Jump_Hosts
3924
3925
3926 =head1 EXAMPLE: Parallelizing rsync
3927
3928 B<rsync> is a great tool, but sometimes it will not fill up the
3929 available bandwidth. This is often a problem when copying several big
3930 files over high speed connections.
3931
3932 The following will start one B<rsync> per big file in I<src-dir> to
3933 I<dest-dir> on the server I<fooserver>:
3934
3935   cd src-dir; find . -type f -size +100000 | \
3936     parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
3937       rsync -s -Havessh {} fooserver:/dest-dir/{}
3938
3939 The dirs created may end up with wrong permissions and smaller files
3940 are not being transferred. To fix those run B<rsync> a final time:
3941
3942   rsync -Havessh src-dir/ fooserver:/dest-dir/
3943
3944 If you are unable to push data, but need to pull it, and the files
3945 are called digits.png (e.g. 000000.png) you might be able to do:
3946
3947   seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/
3948
3949
3950 =head1 EXAMPLE: Use multiple inputs in one command
3951
3952 Copy files like foo.es.ext to foo.ext:
3953
3954   ls *.es.* | perl -pe 'print; s/\.es//' | parallel -N2 cp {1} {2}
3955
3956 The perl command spits out 2 lines for each input. GNU B<parallel>
3957 takes 2 inputs (using B<-N2>) and replaces {1} and {2} with the inputs.
3958
3959 Count in binary:
3960
3961   parallel -k echo ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1
3962
3963 Print the number on the opposing sides of a six sided die:
3964
3965   parallel --link -a <(seq 6) -a <(seq 6 -1 1) echo
3966   parallel --link echo :::: <(seq 6) <(seq 6 -1 1)
3967
3968 Convert files from all subdirs to PNG-files with consecutive numbers
3969 (useful for making input PNGs for B<ffmpeg>):
3970
3971   parallel --link -a <(find . -type f | sort) \
3972     -a <(seq $(find . -type f|wc -l)) convert {1} {2}.png
3973
3974 Alternative version:
3975
3976   find . -type f | sort | parallel convert {} {#}.png
3977
3978
3979 =head1 EXAMPLE: Use a table as input
3980
3981 Content of table_file.tsv:
3982
3983   foo<TAB>bar
3984   baz <TAB> quux
3985
3986 To run:
3987
3988   cmd -o bar -i foo
3989   cmd -o quux -i baz
3990
3991 you can run:
3992
3993   parallel -a table_file.tsv --colsep '\t' cmd -o {2} -i {1}
3994
3995 Note: The default for GNU B<parallel> is to remove the spaces around
3996 the columns. To keep the spaces:
3997
3998   parallel -a table_file.tsv --trim n --colsep '\t' cmd -o {2} -i {1}
3999
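The effect of B<--colsep> with the default trimming can be illustrated without GNU B<parallel>. The sketch below emulates it with B<awk>, which is a stand-in chosen for this illustration and not how GNU B<parallel> works internally; B<expand_table> is a hypothetical helper and I<cmd> is the placeholder command from above.

```shell
# Build the example table: two tab-separated columns, spaces around the tab
# on the second line.
tsv=$(mktemp)
printf 'foo\tbar\nbaz \t quux\n' > "$tsv"

# Print the command line each row would produce for: cmd -o {2} -i {1}
# expand_table is a hypothetical helper that emulates --colsep '\t'.
expand_table() {
  awk -F '\t' '{
    # Strip spaces around every column, like the default trimming.
    for (i = 1; i <= NF; i++) gsub(/^ +| +$/, "", $i)
    print "cmd -o " $2 " -i " $1
  }' "$1"
}

expand_table "$tsv"
# prints:
#   cmd -o bar -i foo
#   cmd -o quux -i baz
```
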
4000
4001 =head1 EXAMPLE: Output to database
4002
4003 GNU B<parallel> can output to a database table and a CSV-file:
4004
4005   DBURL=csv:///%2Ftmp%2Fmy.csv
4006   DBTABLEURL=$DBURL/mytable
4007   parallel --sqlandworker $DBTABLEURL seq ::: {1..10}
4008
4009 It is rather slow and takes up a lot of CPU time because GNU
4010 B<parallel> parses the whole CSV file for each update.
4011
4012 A better approach is to use an SQLite database and then convert that to CSV:
4013
4014   DBURL=sqlite3:///%2Ftmp%2Fmy.sqlite
4015   DBTABLEURL=$DBURL/mytable
4016   parallel --sqlandworker $DBTABLEURL seq ::: {1..10}
4017   sql $DBURL '.headers on' '.mode csv' 'SELECT * FROM mytable;'
4018
4019 This takes around a second per job.
4020
4021 If you have access to a real database system, such as PostgreSQL, it
4022 is even faster:
4023
4024   DBURL=pg://user:pass@host/mydb
4025   DBTABLEURL=$DBURL/mytable
4026   parallel --sqlandworker $DBTABLEURL seq ::: {1..10}
4027   sql $DBURL \
4028     "COPY (SELECT * FROM mytable) TO stdout DELIMITER ',' CSV HEADER;"
4029
4030 Or MySQL:
4031
4032   DBURL=mysql://user:pass@host/mydb
4033   DBTABLEURL=$DBURL/mytable
4034   parallel --sqlandworker $DBTABLEURL seq ::: {1..10}
4035   sql -p -B $DBURL "SELECT * FROM mytable;" > mytable.tsv
4036   perl -pe 's/"/""/g; s/\t/","/g; s/^/"/; s/$/"/; s/\\\\/\\/g;
4037     s/\\t/\t/g; s/\\n/\n/g;' mytable.tsv
4038
4039
4040 =head1 EXAMPLE: Output to CSV-file for R
4041
4042 If you have no need for the advanced job distribution control that a
4043 database provides, but you simply want output into a CSV file that you
4044 can read into R or LibreOffice Calc, then you can use B<--results>:
4045
4046   parallel --results my.csv seq ::: 10 20 30
4047   R
4048   > mydf <- read.csv("my.csv");
4049   > print(mydf[2,])
4050   > write(as.character(mydf[2,c("Stdout")]),'')
4051
4052
4053 =head1 EXAMPLE: Use XML as input
4054
4055 The show Aflyttet on Radio 24syv publishes an RSS feed with their audio
4056 podcasts on: http://arkiv.radio24syv.dk/audiopodcast/channel/4466232
4057
4058 Using B<xpath> you can extract the URLs for 2019 and download them
4059 using GNU B<parallel>:
4060
4061   wget -O - http://arkiv.radio24syv.dk/audiopodcast/channel/4466232 | \
4062     xpath -e "//pubDate[contains(text(),'2019')]/../enclosure/@url" | \
4063     parallel -u wget '{= s/ url="//; s/"//; =}'
4064
4065
4066 =head1 EXAMPLE: Run the same command 10 times
4067
4068 If you want to run the same command with the same arguments 10 times
4069 in parallel you can do:
4070
4071   seq 10 | parallel -n0 my_command my_args
4072
4073
4074 =head1 EXAMPLE: Working as cat | sh. Resource inexpensive jobs and evaluation
4075
4076 GNU B<parallel> can work similarly to B<cat | sh>.
4077
4078 A resource inexpensive job is a job that takes very little CPU, disk
4079 I/O and network I/O. Ping is an example of a resource inexpensive
4080 job. wget is too - if the webpages are small.
4081
4082 The content of the file jobs_to_run:
4083
4084   ping -c 1 10.0.0.1
4085   wget http://example.com/status.cgi?ip=10.0.0.1
4086   ping -c 1 10.0.0.2
4087   wget http://example.com/status.cgi?ip=10.0.0.2
4088   ...
4089   ping -c 1 10.0.0.255
4090   wget http://example.com/status.cgi?ip=10.0.0.255
4091
4092 To run 100 processes simultaneously do:
4093
4094   parallel -j 100 < jobs_to_run
4095
4096 As no I<command> is given, each line of input is executed by the shell.
4097
4098
4099 =head1 EXAMPLE: Processing a big file using more CPUs
4100
4101 To process a big file or some output you can use B<--pipe> to split up
4102 the data into blocks and pipe the blocks into the processing program.
4103
4104 If the program is B<gzip -9> you can do:
4105
4106   cat bigfile | parallel --pipe --recend '' -k gzip -9 > bigfile.gz
4107
4108 This will split B<bigfile> into blocks of 1 MB and pass that to B<gzip
4109 -9> in parallel. One B<gzip> will be run per CPU. The output of B<gzip
4110 -9> will be kept in order and saved to B<bigfile.gz>.
4111
4112 B<gzip> works fine if the output is appended, but some processing does
4113 not work like that - for example sorting. For this GNU B<parallel> can
4114 put the output of each command into a file. This will sort a big file
4115 in parallel:
4116
4117   cat bigfile | parallel --pipe --files sort |\
4118     parallel -Xj1 sort -m {} ';' rm {} >bigfile.sort
4119
4120 Here B<bigfile> is split into blocks of around 1MB, each block ending
4121 in '\n' (which is the default for B<--recend>). Each block is passed
4122 to B<sort> and the output from B<sort> is saved into files. These
4123 files are passed to the second B<parallel> that runs B<sort -m> on the
4124 files before it removes the files. The output is saved to
4125 B<bigfile.sort>.
4126
4127 GNU B<parallel>'s B<--pipe> maxes out at around 100 MB/s because every
4128 byte has to be copied through GNU B<parallel>. But if B<bigfile> is a
4129 real (seekable) file GNU B<parallel> can by-pass the copying and send
4130 the parts directly to the program:
4131
4132   parallel --pipepart --block 100m -a bigfile --files sort |\
4133     parallel -Xj1 sort -m {} ';' rm {} >bigfile.sort
4134
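The idea behind the sorting examples (sort each block on its own, then merge the sorted blocks with B<sort -m>) can be checked on a small file with standard tools only. This sketch runs the block sorts one after another; GNU B<parallel> would run them concurrently. The file names and the block size of 100 lines are assumptions made for the illustration.

```shell
workdir=$(mktemp -d)

# Small stand-in for bigfile: the numbers 0..999 in a scrambled order.
awk 'BEGIN { for (i = 1; i <= 1000; i++) { n = (i * 37) % 1000; print n } }' \
  > "$workdir/bigfile"

# Split on line boundaries, so no record is cut in half.
split -l 100 "$workdir/bigfile" "$workdir/block."

# Sort every block on its own and remove the unsorted block.
for b in "$workdir"/block.*; do
  sort "$b" > "$b.sorted" && rm "$b"
done

# Merge the already sorted blocks; this is much cheaper than a full sort.
sort -m "$workdir"/block.*.sorted > "$workdir/bigfile.sort"

# The result is identical to sorting the whole file in one go.
sort "$workdir/bigfile" | cmp - "$workdir/bigfile.sort" && echo identical
```
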
4135
4136 =head1 EXAMPLE: Grouping input lines
4137
4138 When processing with B<--pipe> you may have lines grouped by a
4139 value. Here is I<my.csv>:
4140
4141    Transaction Customer Item
4142         1       a       53
4143         2       b       65
4144         3       b       82
4145         4       c       96
4146         5       c       67
4147         6       c       13
4148         7       d       90
4149         8       d       43
4150         9       d       91
4151         10      d       84
4152         11      e       72
4153         12      e       102
4154         13      e       63
4155         14      e       56
4156         15      e       74
4157
4158 Let us assume you want GNU B<parallel> to process each customer. In
4159 other words: You want all the transactions for a single customer to be
4160 treated as a single record.
4161
4162 To do this we preprocess the data with a program that inserts a record
4163 separator before each customer (column 2 = $F[1]). Here we first make
4164 a 50 character random string, which we then use as the separator:
4165
4166   sep=`perl -e 'print map { ("a".."z","A".."Z")[rand(52)] } (1..50);'`
4167   cat my.csv | \
4168      perl -ape '$F[1] ne $l and print "'$sep'"; $l = $F[1]' | \
4169      parallel --recend $sep --rrs --pipe -N1 wc
4170
4171 If your program can process multiple customers replace B<-N1> with a
4172 reasonable B<--blocksize>.
4173
4174
4175 =head1 EXAMPLE: Running more than 250 jobs workaround
4176
4177 If you need to run a massive number of jobs in parallel, then you will
4178 likely hit the filehandle limit which is often around 250 jobs. If you
4179 are super user you can raise the limit in /etc/security/limits.conf
4180 but you can also use this workaround. The filehandle limit is per
4181 process. That means that if you just spawn more GNU B<parallel>s then
4182 each of them can run 250 jobs. This will spawn up to 2500 jobs:
4183
4184   cat myinput |\
4185     parallel --pipe -N 50 --roundrobin -j50 parallel -j50 your_prg
4186
4187 This will spawn up to 62500 jobs (use with caution - you need 64 GB
4188 RAM to do this, and you may need to increase /proc/sys/kernel/pid_max):
4189
4190   cat myinput |\
4191     parallel --pipe -N 250 --roundrobin -j250 parallel -j250 your_prg
4192
4193
4194 =head1 EXAMPLE: Working as mutex and counting semaphore
4195
4196 The command B<sem> is an alias for B<parallel --semaphore>.
4197
4198 A counting semaphore will allow a given number of jobs to be started
4199 in the background. When that number of jobs is running in the
4200 background, GNU B<sem> will wait for one of these to complete before
4201 starting another command. B<sem --wait> will wait for all jobs to
4202 complete.
4203
4204 Run 10 jobs concurrently in the background:
4205
4206   for i in *.log ; do
4207     echo $i
4208     sem -j10 gzip $i ";" echo done
4209   done
4210   sem --wait
4211
4212 A mutex is a counting semaphore allowing only one job to run. This
4213 will edit the file I<myfile> and prepend it with lines containing the
4214 numbers 1 to 3:
4215
4216   seq 3 | parallel sem sed -i -e '1i{}' myfile
4217
4218 As I<myfile> can be very big it is important that only one process
4219 edits the file at a time.
4220
4221 Name the semaphore to have multiple different semaphores active at the
4222 same time:
4223
4224   seq 3 | parallel sem --id mymutex sed -i -e '1i{}' myfile
4225
4226
4227 =head1 EXAMPLE: Mutex for a script
4228
4229 Assume a script is called from cron or from a web service, but only
4230 one instance can be run at a time. With B<sem> and B<--shebang-wrap>
4231 the script can be made to wait for other instances to finish. Here in
4232 B<bash>:
4233
4234   #!/usr/bin/sem --shebang-wrap -u --id $0 --fg /bin/bash
4235
4236   echo This will run
4237   sleep 5
4238   echo exclusively
4239
4240 Here B<perl>:
4241
4242   #!/usr/bin/sem --shebang-wrap -u --id $0 --fg /usr/bin/perl
4243
4244   print "This will run ";
4245   sleep 5;
4246   print "exclusively\n";
4247
4248 Here B<python>:
4249
4250   #!/usr/local/bin/sem --shebang-wrap -u --id $0 --fg /usr/bin/python
4251
4252   import time
4253   print("This will run ")
4254   time.sleep(5)
4255   print("exclusively")
4256
4257
4258 =head1 EXAMPLE: Start editor with filenames from stdin (standard input)
4259
4260 You can use GNU B<parallel> to start interactive programs like emacs or vi:
4261
4262   cat filelist | parallel --tty -X emacs
4263   cat filelist | parallel --tty -X vi
4264
4265 If there are more files than will fit on a single command line, the
4266 editor will be started again with the remaining files.
4267
4268
4269 =head1 EXAMPLE: Running sudo
4270
4271 B<sudo> requires a password to run a command as root. It caches the
4272 access, so you only need to enter the password again if you have not
4273 used B<sudo> for a while.
4274
4275 The command:
4276
4277   parallel sudo echo ::: This is a bad idea
4278
4279 is no good, as you would be prompted for the sudo password for each of
4280 the jobs. You can either do:
4281
4282   sudo echo This
4283   parallel sudo echo ::: is a good idea
4284
4285 or:
4286
4287   sudo parallel echo ::: This is a good idea
4288
4289 This way you only have to enter the sudo password once.
4290
4291
4292 =head1 EXAMPLE: GNU Parallel as queue system/batch manager
4293
4294 GNU B<parallel> can work as a simple job queue system or batch manager.
4295 The idea is to put the jobs into a file and have GNU B<parallel> read
4296 from that continuously. As GNU B<parallel> will stop at end of file we
4297 use B<tail> to continue reading:
4298
4299   true >jobqueue; tail -n+0 -f jobqueue | parallel
4300
4301 To submit your jobs to the queue:
4302
4303   echo my_command my_arg >> jobqueue
4304
4305 You can of course use B<-S> to distribute the jobs to remote
4306 computers:
4307
4308   true >jobqueue; tail -n+0 -f jobqueue | parallel -S ..
4309
4310 If you keep this running for a long time, jobqueue will grow. A way of
4311 removing the jobs already run is by making GNU B<parallel> stop when
4312 it hits a special value and then restart. To use B<--eof> to make GNU
4313 B<parallel> exit, B<tail> also needs to be forced to exit:
4314
4315   true >jobqueue;
4316   while true; do
4317     tail -n+0 -f jobqueue |
4318       (parallel -E StOpHeRe -S ..; echo GNU Parallel is now done;
4319        perl -e 'while(<>){/StOpHeRe/ and last};print <>' jobqueue > j2;
4320        (seq 1000 >> jobqueue &);
4321        echo Done appending dummy data forcing tail to exit)
4322     echo tail exited;
4323     mv j2 jobqueue
4324   done
4325
4326 In some cases you can run on more CPUs and computers during the night:
4327
4328   # Day time
4329   echo 50% > jobfile
4330   cp day_server_list ~/.parallel/sshloginfile
4331   # Night time
4332   echo 100% > jobfile
4333   cp night_server_list ~/.parallel/sshloginfile
4334   tail -n+0 -f jobqueue | parallel --jobs jobfile -S ..
4335
4336 GNU B<parallel> discovers if B<jobfile> or B<~/.parallel/sshloginfile>
4337 changes.
4338
4339 There is a small issue when using GNU B<parallel> as queue
4340 system/batch manager: You have to submit JobSlot number of jobs before
4341 they will start, and after that you can submit one at a time, and the
4342 job will start immediately if free slots are available.  Output from the
4343 running or completed jobs is held back and will only be printed when
4344 JobSlots more jobs have been started (unless you use --ungroup or
4345 --line-buffer, in which case the output from the jobs is printed
4346 immediately).  E.g. if you have 10 jobslots then the output from the
4347 first completed job will only be printed when job 11 has started, and
4348 the output of the second completed job will only be printed when job 12
4349 has started.
4350
4351
4352 =head1 EXAMPLE: GNU Parallel as dir processor
4353
4354 If you have a dir in which users drop files that need to be processed
4355 you can do this on GNU/Linux (If you know what B<inotifywait> is
4356 called on other platforms file a bug report):
4357
4358   inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
4359     parallel -u echo
4360
4361 This will run the command B<echo> on each file put into B<my_dir> or
4362 subdirs of B<my_dir>.
4363
4364 You can of course use B<-S> to distribute the jobs to remote
4365 computers:
4366
4367   inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
4368     parallel -S ..  -u echo
4369
4370 If the files to be processed are in a tar file then unpacking one file
4371 and processing it immediately may be faster than first unpacking all
4372 files. Set up the dir processor as above and unpack into the dir.
4373
4374 Using GNU B<parallel> as dir processor has the same limitations as
4375 using GNU B<parallel> as queue system/batch manager.
4376
4377
4378 =head1 EXAMPLE: Locate the missing package
4379
4380 If you have downloaded source and tried compiling it, you may have seen:
4381
4382   $ ./configure
4383   [...]
4384   checking for something.h... no
4385   configure: error: "libsomething not found"
4386
4387 Often it is not obvious which package you should install to get that
4388 file. Debian has `apt-file` to search for a file. `tracefile` from
4389 https://gitlab.com/ole.tange/tangetools can tell which files a program
4390 tried to access. In this case we are interested in one of the last
4391 files:
4392
4393   $ tracefile -un ./configure | tail | parallel -j0 apt-file search
4394
4395
4396 =head1 QUOTING
4397
4398 GNU B<parallel> is very liberal in quoting. You only need to quote
4399 characters that have special meaning in shell:
4400
4401   ( ) $ ` ' " < > ; | \
4402
4403 and depending on context these need to be quoted, too:
4404
4405   ~ & # ! ? space * {
4406
4407 Therefore most people will never need more quoting than putting '\'
4408 in front of the special characters.
4409
4410 Often you can simply put \' around every ':
4411
4412   perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' file
4413
4414 can be quoted:
4415
4416   parallel perl -ne \''/^\S+\s+\S+$/ and print $ARGV,"\n"'\' ::: file
4417
4418 However, when you want to use a shell variable you need to quote the
4419 $-sign. Here is an example using $PARALLEL_SEQ. This variable is set
4420 by GNU B<parallel> itself, so the evaluation of the $ must be done by
4421 the sub shell started by GNU B<parallel>:
4422
4423   seq 10 | parallel -N2 echo seq:\$PARALLEL_SEQ arg1:{1} arg2:{2}
4424
4425 If the variable is set before GNU B<parallel> starts you can do this:
4426
4427   VAR=this_is_set_before_starting
4428   echo test | parallel echo {} $VAR
4429
4430 Prints: B<test this_is_set_before_starting>
4431
4432 It is a little more tricky if the variable contains more than one space in a row:
4433
4434   VAR="two  spaces  between  each  word"
4435   echo test | parallel echo {} \'"$VAR"\'
4436
4437 Prints: B<test two  spaces  between  each  word>
4438
4439 If the variable should not be evaluated by the shell starting GNU
4440 B<parallel> but be evaluated by the sub shell started by GNU
4441 B<parallel>, then you need to quote it:
4442
4443   echo test | parallel VAR=this_is_set_after_starting \; echo {} \$VAR
4444
4445 Prints: B<test this_is_set_after_starting>
4446
4447 It is a little more tricky if the variable contains space:
4448
4449   echo test |\
4450     parallel VAR='"two  spaces  between  each  word"' echo {} \'"$VAR"\'
4451
4452 Prints: B<test two  spaces  between  each  word>
4453
4454 $$ is the shell variable containing the process id of the shell. This
4455 will print the process id of the shell running GNU B<parallel>:
4456
4457   seq 10 | parallel echo $$
4458
4459 And this will print the process ids of the sub shells started by GNU
4460 B<parallel>.
4461
4462   seq 10 | parallel echo \$\$
4463
4464 If the special characters should not be evaluated by the sub shell
4465 then you need to protect them against evaluation from both the shell
4466 starting GNU B<parallel> and the sub shell:
4467
4468   echo test | parallel echo {} \\\$VAR
4469
4470 Prints: B<test $VAR>
4471
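The three levels of quoting described above can be tried without GNU B<parallel> by letting B<sh -c> play the role of the sub shell. This is only an analogy under that assumption; the function names are made up, and GNU B<parallel> adds its own processing (such as replacing B<{}>) that B<sh -c> does not.

```shell
VAR=outer

# 1. Unquoted: the calling shell expands $VAR before the sub shell starts.
level_outer() { sh -c "echo $VAR"; }

# 2. Quoted once: the sub shell receives $VAR and expands it itself.
level_inner() { sh -c "VAR=inner; echo \$VAR"; }

# 3. Quoted for both shells: neither expands it, so the text $VAR is printed.
level_none() { sh -c "echo \\\$VAR"; }

level_outer   # prints: outer
level_inner   # prints: inner
level_none    # prints: $VAR
```
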
4472 GNU B<parallel> can protect against evaluation by the sub shell by
4473 using -q:
4474
4475   echo test | parallel -q echo {} \$VAR
4476
4477 Prints: B<test $VAR>
4478
4479 This is particularly useful if you have lots of quoting. If you want
4480 to run a perl script like this:
4481
4482   perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' file
4483
4484 It needs to be quoted like one of these:
4485
4486   ls | parallel perl -ne '/^\\S+\\s+\\S+\$/\ and\ print\ \$ARGV,\"\\n\"'
4487   ls | parallel perl -ne \''/^\S+\s+\S+$/ and print $ARGV,"\n"'\'
4488
4489 Notice how spaces, \'s, "'s, and $'s need to be quoted. GNU B<parallel>
4490 can do the quoting by using option -q:
4491
4492   ls | parallel -q  perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"'
4493
4494 However, this means you cannot make the sub shell interpret special
4495 characters. For example because of B<-q> this WILL NOT WORK:
4496
4497   ls *.gz | parallel -q "zcat {} >{.}"
4498   ls *.gz | parallel -q "zcat {} | bzip2 >{.}.bz2"
4499
4500 because > and | need to be interpreted by the sub shell.
4501
4502 If you get errors like:
4503
4504   sh: -c: line 0: syntax error near unexpected token
4505   sh: Syntax error: Unterminated quoted string
4506   sh: -c: line 0: unexpected EOF while looking for matching `''
4507   sh: -c: line 1: syntax error: unexpected end of file
4508   zsh:1: no matches found:
4509
4510 then you might try using B<-q>.
4511
4512 If you are using B<bash> process substitution like B<<(cat foo)> then
4513 you may try B<-q> and prepending I<command> with B<bash -c>:
4514
4515   ls | parallel -q bash -c 'wc -c <(echo {})'
4516
4517 Or for substituting output:
4518
4519   ls | parallel -q bash -c \
4520     'tar c {} | tee >(gzip >{}.tar.gz) | bzip2 >{}.tar.bz2'
4521
4522 B<Conclusion>: To avoid dealing with the quoting problems it may be
4523 easier just to write a small script or a function (remember to
4524 B<export -f> the function) and have GNU B<parallel> call that.
4525
4526
4527 =head1 LIST RUNNING JOBS
4528
4529 If you want a list of the jobs currently running you can run:
4530
4531   killall -USR1 parallel
4532
4533 GNU B<parallel> will then print the currently running jobs on stderr
4534 (standard error).
4535
4536
4537 =head1 COMPLETE RUNNING JOBS BUT DO NOT START NEW JOBS
4538
4539 If you regret starting a lot of jobs you can simply break GNU B<parallel>,
4540 but if you want to make sure you do not have half-completed jobs you
4541 should send the signal B<SIGHUP> to GNU B<parallel>:
4542
4543   killall -HUP parallel
4544
4545 This will tell GNU B<parallel> to not start any new jobs, but wait until
4546 the currently running jobs are finished before exiting.
4547
4548
4549 =head1 ENVIRONMENT VARIABLES
4550
4551 =over 9
4552
4553 =item $PARALLEL_HOME
4554
4555 Dir where GNU B<parallel> stores config files, semaphores, and caches
4556 information between invocations. Default: $HOME/.parallel.
4557
4558 =item $PARALLEL_PID
4559
4560 The environment variable $PARALLEL_PID is set by GNU B<parallel> and
4561 is visible to the jobs started from GNU B<parallel>. This makes it
4562 possible for the jobs to communicate directly to GNU B<parallel>.
4563 Remember to quote the $, so it gets evaluated by the correct
4564 shell.
4565
4566 B<Example:> If each of the jobs tests a solution and one of the jobs finds
4567 the solution the job can tell GNU B<parallel> not to start more jobs
4568 by: B<kill -HUP $PARALLEL_PID>. This only works on the local
4569 computer.
4570
4571
4572 =item $PARALLEL_RSYNC_OPTS
4573
4574 Options to pass on to B<rsync>. Defaults to: -rlDzR.
4575
4576
4577 =item $PARALLEL_SHELL
4578
4579 Use this shell for the commands run by GNU B<parallel>:
4580
4581 =over 2
4582
4583 =item *
4584
4585 $PARALLEL_SHELL. If undefined use:
4586
4587 =item *
4588
4589 The shell that started GNU B<parallel>. If that cannot be determined:
4590
4591 =item *
4592
4593 $SHELL. If undefined use:
4594
4595 =item *
4596
4597 /bin/sh
4598
4599 =back
4600
4601
4602 =item $PARALLEL_SSH
4603
4604 GNU B<parallel> defaults to using B<ssh> for remote access. This can
4605 be overridden with $PARALLEL_SSH, which again can be overridden with
4606 B<--ssh>. It can also be set on a per server basis (see
4607 B<--sshlogin>).
4608
4609
4610 =item $PARALLEL_SEQ
4611
4612 $PARALLEL_SEQ will be set to the sequence number of the job
4613 running. Remember to quote the $, so it gets evaluated by the correct
4614 shell.
4615
4616 B<Example:>
4617
4618   seq 10 | parallel -N2 \
4619     echo seq:'$'PARALLEL_SEQ arg1:{1} arg2:{2}
4620
4621
4622 =item $PARALLEL_TMUX
4623
4624 Path to B<tmux>. If unset the B<tmux> in $PATH is used.
4625
4626
4627 =item $TMPDIR
4628
4629 Directory for temporary files. See: B<--tmpdir>.
4630
4631
4632 =item $PARALLEL
4633
4634 The environment variable $PARALLEL will be used as default options for
4635 GNU B<parallel>. If the variable contains special shell characters
4636 (e.g. $, *, or space) then these need to be escaped with \.
4637
4638 B<Example:>
4639
4640   cat list | parallel -j1 -k -v ls
4641   cat list | parallel -j1 -k -v -S"myssh user@server" ls
4642
4643 can be written as:
4644
4645   cat list | PARALLEL="-kvj1" parallel ls
4646   cat list | PARALLEL='-kvj1 -S myssh\ user@server' \
4647     parallel ls
4648
4649 Notice the \ in the middle is needed because 'myssh' and 'user@server'
4650 must be one argument.
4651
4652 =back
4653
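The lookup order listed under $PARALLEL_SHELL above can be restated as a small shell function. This is a sketch of the documented order only, not GNU B<parallel>'s own code: B<pick_shell> is a hypothetical name, and the shell that started the program is passed in as an argument because detecting it is outside the scope of the sketch.

```shell
# Print the shell that would be chosen, following the documented order.
# pick_shell is a hypothetical helper.
pick_shell() {
  # $1 = shell that started the program, or empty if it cannot be determined
  if [ -n "${PARALLEL_SHELL:-}" ]; then
    echo "$PARALLEL_SHELL"
  elif [ -n "${1:-}" ]; then
    echo "$1"
  elif [ -n "${SHELL:-}" ]; then
    echo "$SHELL"
  else
    echo /bin/sh
  fi
}

# Example: the parent shell is unknown, so $PARALLEL_SHELL or $SHELL decides.
pick_shell ""
```
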
4654
4655 =head1 DEFAULT PROFILE (CONFIG FILE)
4656
4657 The global configuration file /etc/parallel/config, followed by user
4658 configuration file ~/.parallel/config (formerly known as .parallelrc)
4659 will be read in turn if they exist.  Lines starting with '#' will be
4660 ignored. The format can follow that of the environment variable
4661 $PARALLEL, but it is often easier to simply put each option on its own
4662 line.
4663
4664 Options on the command line take precedence, followed by the
4665 environment variable $PARALLEL, user configuration file
4666 ~/.parallel/config, and finally the global configuration file
4667 /etc/parallel/config.
4668
4669 Note that no file that is read for options, nor the environment
4670 variable $PARALLEL, may contain retired options such as B<--tollef>.
4671
4672 =head1 PROFILE FILES
4673
4674 If B<--profile> is set, GNU B<parallel> will read the profile from that
4675 file rather than the global or user configuration files. You can have
4676 multiple B<--profiles>.
4677
4678 Example: Profile for running a command on every sshlogin in
4679 ~/.ssh/sshlogins and prepend the output with the sshlogin:
4680
4681   echo --tag -S .. --nonall > ~/.parallel/n
4682   parallel -Jn uptime
4683
4684 Example: Profile for running every command with B<-j-1> and B<nice>:
4685
4686   echo -j-1 nice > ~/.parallel/nice_profile
4687   parallel -J nice_profile bzip2 -9 ::: *
4688
4689 Example: Profile for running a perl script before every command:
4690
4691   echo "perl -e '\$a=\$\$; print \$a,\" \",'\$PARALLEL_SEQ',\" \";';" \
4692     > ~/.parallel/pre_perl
4693   parallel -J pre_perl echo ::: *
4694
4695 Note how the $ and " need to be quoted using \.
4696
4697 Example: Profile for running distributed jobs with B<nice> on the
4698 remote computers:
4699
4700   echo -S .. nice > ~/.parallel/dist
4701   parallel -J dist --trc {.}.bz2 bzip2 -9 ::: *
4702
4703
4704 =head1 EXIT STATUS
4705
4706 Exit status depends on B<--halt-on-error> if one of these is used:
4707 success=X, success=Y%, fail=Y%.
4708
4709 =over 6
4710
4711 =item Z<>0
4712
4713 All jobs ran without error. If success=X is used: X jobs ran without
4714 error. If success=Y% is used: Y% of the jobs ran without error.
4715
4716 =item Z<>1-100
4717
4718 Some of the jobs failed. The exit status gives the number of failed
4719 jobs. If Y% is used the exit status is the percentage of jobs that
4720 failed.
4721
4722 =item Z<>101
4723
4724 More than 100 jobs failed.
4725
4726 =item Z<>255
4727
4728 Other error.
4729
4730 =item Z<>-1 (In joblog and SQL table)
4731
4732 Killed by Ctrl-C, timeout, not enough memory or similar.
4733
4734 =item Z<>-2 (In joblog and SQL table)
4735
4736 skip() was called in B<{= =}>.
4737
4738 =item Z<>-1000 (In SQL table)
4739
4740 Job is ready to run (set by B<--sqlmaster>).
4741
4742 =item Z<>-1220 (In SQL table)
4743
4744 Job is taken by worker (set by B<--sqlworker>).
4745
4746 =back
4747
4748 If fail=1 is used, the exit status will be the exit status of the
4749 failing job.
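
A minimal sketch of the default behaviour (no B<--halt-on-error>),
where the exit status is the number of failed jobs. The three jobs and
the variable name I<status> are made up for illustration:

```shell
# Three jobs: one exits with 0, the other two exit non-zero (1 and 2).
status=0
if command -v parallel >/dev/null 2>&1; then
  parallel 'exit {}' ::: 0 1 2 || status=$?
  echo "failed jobs: $status"
fi
```

With two of the three jobs failing, the reported status should be 2.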
4750
4751
4752 =head1 DIFFERENCES BETWEEN GNU Parallel AND ALTERNATIVES
4753
4754 See: B<man parallel_alternatives>
4755
4756
4757 =head1 BUGS
4758
4759 =head2 Quoting of newline
4760
4761 Because of the way newline is quoted this will not work:
4762
4763   echo 1,2,3 | parallel -vkd, "echo 'a{}b'"
4764
4765 However, these will all work:
4766
4767   echo 1,2,3 | parallel -vkd, echo a{}b
4768   echo 1,2,3 | parallel -vkd, "echo 'a'{}'b'"
4769   echo 1,2,3 | parallel -vkd, "echo 'a'"{}"'b'"
4770
4771
4772 =head2 Speed
4773
4774 =head3 Startup
4775
4776 GNU B<parallel> is slow at starting up - around 250 ms the first time
4777 and 150 ms after that.
4778
4779 =head3 Job startup
4780
4781 Starting a job on the local machine takes around 10 ms. This can be a
4782 big overhead if the job takes very few ms to run. Often you can group
4783 small jobs together using B<-X> which will make the overhead less
4784 significant. Or you can run multiple GNU B<parallel>s as described in
4785 B<EXAMPLE: Speeding up fast jobs>.
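
The effect of grouping can be sketched by counting how many times
B<echo> is started with and without B<-X>. The input size (200) and
B<-j2> are arbitrary choices for the illustration:

```shell
if command -v parallel >/dev/null 2>&1; then
  # One job per input line: 200 separate echo processes.
  single=$(seq 1 200 | parallel -j2 echo | wc -l | tr -d ' ')
  # -X packs many arguments into each job, so far fewer jobs start.
  grouped=$(seq 1 200 | parallel -j2 -X echo | wc -l | tr -d ' ')
  echo "jobs without -X: $single, with -X: $grouped"
fi
```

The exact number of jobs with B<-X> depends on the number of job slots
and on the allowed command line length.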
4786
4787 =head3 SSH
4788
4789 When using multiple computers GNU B<parallel> opens B<ssh> connections
4790 to them to figure out how many connections can be used reliably
4791 simultaneously (namely B<sshd>'s MaxStartups). This test is done for each
4792 host in serial, so if your B<--sshloginfile> contains many hosts it may
4793 be slow.
4794
4795 If your jobs are short you may see that there are fewer jobs running
4796 on the remote systems than expected. This is due to time spent logging
4797 in and out. B<-M> may help here.
4798
4799 =head3 Disk access
4800
4801 A single disk can normally read data faster if it reads one file at a
4802 time instead of reading a lot of files in parallel, as this will avoid
4803 disk seeks. However, newer disk systems with multiple drives can read
4804 faster if reading from multiple files in parallel.
4805
4806 If the jobs are of the form read-all-compute-all-write-all, so
4807 everything is read before anything is written, it may be faster to
4808 force only one disk access at a time:
4809
4810   sem --id diskio cat file | compute | sem --id diskio cat > file
4811
4812 If the jobs are of the form read-compute-write, so writing starts
4813 before all reading is done, it may be faster to force only one reader
4814 and one writer at a time:
4815
4816   sem --id read cat file | compute | sem --id write cat > file
4817
4818 If the jobs are of the form read-compute-read-compute, it may be
4819 faster to run more jobs in parallel than the system has CPUs, as some
4820 of the jobs will be stuck waiting for disk access.
4821
4822 =head2 --nice limits command length
4823
4824 The current implementation of B<--nice> is too pessimistic in the max
4825 allowed command length. It only uses a little more than half of what
4826 it could. This affects B<-X> and B<-m>. If this becomes a real problem for
4827 you, file a bug report.
4828
4829 =head2 Aliases and functions do not work
4830
4831 If you get:
4832
4833   Can't exec "command": No such file or directory
4834
4835 or:
4836
4837   open3: exec of by command failed
4838
4839 or:
4840
4841   /bin/bash: command: command not found
4842
4843 it may be because I<command> is not known, but it could also be
4844 because I<command> is an alias or a function. If it is a function you
4845 need to B<export -f> the function first or use B<env_parallel>. An
4846 alias will only work if you use B<env_parallel>.
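
A small sketch of the B<export -f> approach, assuming the calling
shell is B<bash> (B<export -f> is not available in plain POSIX sh).
The function name I<greet> and its arguments are hypothetical:

```shell
# bash only: 'export -f' is a bash feature, not POSIX sh.
greet() { echo "hello $1"; }

if [ -n "${BASH_VERSION:-}" ] && command -v parallel >/dev/null 2>&1; then
  export -f greet                          # make the function visible to jobs
  out=$(parallel -k greet ::: alice bob)   # -k keeps output in input order
  echo "$out"
fi
```

Without the B<export -f> line the jobs would typically fail with a
"command not found" message like the ones shown above.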
4847
4848 =head2 Database with MySQL fails randomly
4849
4850 The B<--sql*> options may fail randomly with MySQL. This problem does
4851 not exist with PostgreSQL.
4852
4853
4854 =head1 REPORTING BUGS
4855
4856 Report bugs to <bug-parallel@gnu.org> or
4857 https://savannah.gnu.org/bugs/?func=additem&group=parallel
4858
4859 See a perfect bug report on
4860 https://lists.gnu.org/archive/html/bug-parallel/2015-01/msg00000.html
4861
4862 Your bug report should always include:
4863
4864 =over 2
4865
4866 =item *
4867
4868 The error message you get (if any). If the error message is not from
4869 GNU B<parallel> you need to show why you think GNU B<parallel> caused
4870 it.
4871
4872 =item *
4873
4874 The complete output of B<parallel --version>. If you are not running
4875 the latest released version (see http://ftp.gnu.org/gnu/parallel/) you
4876 should specify why you believe the problem is not fixed in that
4877 version.
4878
4879 =item *
4880
4881 A minimal, complete, and verifiable example (See description on
4882 http://stackoverflow.com/help/mcve).
4883
4884 It should be a complete example that others can run that shows the problem
4885 including all files needed to run the example. This should preferably
4886 be small and simple, so try to remove as many options as possible. A
4887 combination of B<yes>, B<seq>, B<cat>, B<echo>, and B<sleep> can
4888 reproduce most errors. If your example requires large files, see if
4889 you can make them by something like B<seq 1000000> > B<file> or B<yes
4890 | head -n 10000000> > B<file>.
4891
4892 If your example requires remote execution, see if you can use
4893 B<localhost> - maybe using another login.
4894
4895 If you have access to a different system, test if the MCVE shows the
4896 problem on that system.
4897
4898 =item *
4899
4900 The output of your example. If your problem is not easily reproduced
4901 by others, the output might help them figure out the problem.
4902
4903 =item *
4904
4905 Whether you have watched the intro videos
4906 (http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1), walked
4907 through the tutorial (man parallel_tutorial), and read the EXAMPLE
4908 section in the man page (man parallel - search for EXAMPLE:).
4909
4910 =back
4911
4912 If you suspect the error is dependent on your environment or
4913 distribution, please see if you can reproduce the error on one of
4914 these VirtualBox images:
4915 http://sourceforge.net/projects/virtualboximage/files/
4916 http://www.osboxes.org/virtualbox-images/
4917
4918 Specifying the name of your distribution is not enough as you may have
4919 installed software that is not in the VirtualBox images.
4920
4921 If you cannot reproduce the error on any of the VirtualBox images
4922 above, see if you can build a VirtualBox image on which you can
4923 reproduce the error. If not you should assume the debugging will be
4924 done through you. That will put more burden on you and it is extra
4925 important you give any information that helps. In general the problem
4926 will be fixed faster and with less work for you if you can reproduce
4927 the error on a VirtualBox.
4928
4929
4930 =head1 AUTHOR
4931
4932 When using GNU B<parallel> for a publication please cite:
4933
4934 O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
4935 The USENIX Magazine, February 2011:42-47.
4936
4937 This helps funding further development; and it won't cost you a cent.
4938 If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
4939
4940 Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk
4941
4942 Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk
4943
4944 Copyright (C) 2010-2019 Ole Tange,
4945 http://ole.tange.dk and Free Software Foundation, Inc.
4946
4947 Parts of the manual concerning B<xargs> compatibility are inspired by
4948 the manual of B<xargs> from GNU findutils 4.4.2.
4949
4950
4951 =head1 LICENSE
4952
4953 This program is free software; you can redistribute it and/or modify
4954 it under the terms of the GNU General Public License as published by
4955 the Free Software Foundation; either version 3 of the License, or
4956 (at your option) any later version.
4957
4958 This program is distributed in the hope that it will be useful,
4959 but WITHOUT ANY WARRANTY; without even the implied warranty of
4960 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
4961 GNU General Public License for more details.
4962
4963 You should have received a copy of the GNU General Public License
4964 along with this program.  If not, see <http://www.gnu.org/licenses/>.
4965
4966 =head2 Documentation license I
4967
4968 Permission is granted to copy, distribute and/or modify this documentation
4969 under the terms of the GNU Free Documentation License, Version 1.3 or
4970 any later version published by the Free Software Foundation; with no
4971 Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
4972 Texts.  A copy of the license is included in the file fdl.txt.
4973
4974 =head2 Documentation license II
4975
4976 You are free:
4977
4978 =over 9
4979
4980 =item B<to Share>
4981
4982 to copy, distribute and transmit the work
4983
4984 =item B<to Remix>
4985
4986 to adapt the work
4987
4988 =back
4989
4990 Under the following conditions:
4991
4992 =over 9
4993
4994 =item B<Attribution>
4995
4996 You must attribute the work in the manner specified by the author or
4997 licensor (but not in any way that suggests that they endorse you or
4998 your use of the work).
4999
5000 =item B<Share Alike>
5001
5002 If you alter, transform, or build upon this work, you may distribute
5003 the resulting work only under the same, similar or a compatible
5004 license.
5005
5006 =back
5007
5008 With the understanding that:
5009
5010 =over 9
5011
5012 =item B<Waiver>
5013
5014 Any of the above conditions can be waived if you get permission from
5015 the copyright holder.
5016
5017 =item B<Public Domain>
5018
5019 Where the work or any of its elements is in the public domain under
5020 applicable law, that status is in no way affected by the license.
5021
5022 =item B<Other Rights>
5023
5024 In no way are any of the following rights affected by the license:
5025
5026 =over 2
5027
5028 =item *
5029
5030 Your fair dealing or fair use rights, or other applicable
5031 copyright exceptions and limitations;
5032
5033 =item *
5034
5035 The author's moral rights;
5036
5037 =item *
5038
5039 Rights other persons may have either in the work itself or in
5040 how the work is used, such as publicity or privacy rights.
5041
5042 =back
5043
5044 =back
5045
5046 =over 9
5047
5048 =item B<Notice>
5049
5050 For any reuse or distribution, you must make clear to others the
5051 license terms of this work.
5052
5053 =back
5054
5055 A copy of the full license is included in the file cc-by-sa.txt.
5056
5057
5058 =head1 DEPENDENCIES
5059
5060 GNU B<parallel> uses Perl, and the Perl modules Getopt::Long,
5061 IPC::Open3, Symbol, IO::File, POSIX, and File::Temp. For remote usage
5062 it also uses rsync with ssh.
5063
5064
5065 =head1 SEE ALSO
5066
5067 B<ssh>(1), B<ssh-agent>(1), B<sshpass>(1), B<ssh-copy-id>(1),
5068 B<rsync>(1), B<find>(1), B<xargs>(1), B<dirname>(1), B<make>(1),
5069 B<pexec>(1), B<ppss>(1), B<xjobs>(1), B<prll>(1), B<dxargs>(1),
5070 B<mdm>(1)
5071
5072 =cut