src/parallel_tutorial.pod

   1 #!/usr/bin/perl -w
   2
   3 # SPDX-FileCopyrightText: 2021-2024 Ole Tange, http://ole.tange.dk and Free Software Foundation, Inc.
   4 # SPDX-License-Identifier: GFDL-1.3-or-later
   5 # SPDX-License-Identifier: CC-BY-SA-4.0
   6
   7 =head1 GNU Parallel Tutorial
   8
   9 This tutorial shows off much of GNU B<parallel>'s functionality. The
  10 tutorial is meant to teach the options and syntax of GNU
  11 B<parallel>.  It is B<not> meant to show realistic examples from the
  12 real world.
  13
  14 =head2 Reader's guide
  15
  16 If you prefer reading a book, buy B<GNU Parallel 2018> at
  17 https://www.lulu.com/shop/ole-tange/gnu-parallel-2018/paperback/product-23558902.html
  18 or download it at: https://doi.org/10.5281/zenodo.1146014
  19
  20 Otherwise start by watching the intro videos for a quick introduction:
  21 https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
  22
  23 Then browse through the examples (B<man parallel_examples>). That will give
  24 you an idea of what GNU B<parallel> is capable of.
  25
  26 If you want to dive even deeper: spend a couple of hours walking
  27 through the tutorial (B<man parallel_tutorial>). Your command line
  28 will love you for it.
  29
  30 Finally you may want to look at the rest of the manual (B<man
  31 parallel>) if you have special needs not already covered.
  32
  33 If you want to know the design decisions behind GNU B<parallel>, try:
  34 B<man parallel_design>. This is also a good intro if you intend to
  35 change GNU B<parallel>.
  36
  37
  38
  39 =head1 Prerequisites
  40
  41 To run this tutorial you must have the following:
  42
  43 =over 9
  44
  45 =item parallel >= version 20160822
  46
  47 Install the newest version using your package manager (recommended for
  48 security reasons), the way described in README, or with this command:
  49
  50   $ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \
  51      fetch -o - http://pi.dk/3 ) > install.sh
  52   $ sha1sum install.sh
  53   12345678 51621b7f 1ee103c0 0783aae4 ef9889f8
  54   $ md5sum install.sh
  55   62eada78 703b5500 241b8e50 baf62758
  56   $ sha512sum install.sh
  57   160d3159 9480cf5c a101512f 150b7ac0 206a65dc 86f2bb6b bdf1a2bc 96bc6d06
  58   7f8237c2 0964b67f bccf8a93 332528fa 11e5ab43 2a6226a6 ceb197ab 7f03c061
  59   $ bash install.sh
  60
  61 This will also install the newest version of the tutorial which you
  62 can see by running this:
  63
  64   man parallel_tutorial
  65
  66 Most of the tutorial will work on older versions, too.
  67
  68
  69 =item abc-file:
  70
  71 The file can be generated by this command:
  72
  73   parallel -k echo ::: A B C > abc-file
  74
  75 =item def-file:
  76
  77 The file can be generated by this command:
  78
  79   parallel -k echo ::: D E F > def-file
  80
  81 =item abc0-file:
  82
  83 The file can be generated by this command:
  84
  85   perl -e 'printf "A\0B\0C\0"' > abc0-file
  86
  87 =item abc_-file:
  88
  89 The file can be generated by this command:
  90
  91   perl -e 'printf "A_B_C_"' > abc_-file
  92
  93 =item tsv-file.tsv
  94
  95 The file can be generated by this command:
  96
  97   perl -e 'printf "f1\tf2\nA\tB\nC\tD\n"' > tsv-file.tsv
  98
  99 =item num8
 100
 101 The file can be generated by this command:
 102
 103   perl -e 'for(1..8){print "$_\n"}' > num8
 104
 105 =item num128
 106
 107 The file can be generated by this command:
 108
 109   perl -e 'for(1..128){print "$_\n"}' > num128
 110
 111 =item num30000
 112
 113 The file can be generated by this command:
 114
 115   perl -e 'for(1..30000){print "$_\n"}' > num30000
 116
 117 =item num1000000
 118
 119 The file can be generated by this command:
 120
 121   perl -e 'for(1..1000000){print "$_\n"}' > num1000000
 122
 123 =item num_%header
 124
 125 The file can be generated by this command:
 126
 127   (echo %head1; echo %head2; \
 128    perl -e 'for(1..10){print "$_\n"}') > num_%header
 129
 130 =item fixedlen
 131
 132 The file can be generated by this command:
 133
 134   perl -e 'print "HHHHAAABBBCCC"' > fixedlen
 135
 136 =item For remote running: passwordless ssh login to the 2 servers in
 137 $SERVER1 and $SERVER2 must work.
 138
 139   SERVER1=server.example.com
 140   SERVER2=server2.example.net
 141
 142 So you must be able to do this without entering a password:
 143
 144   ssh $SERVER1 echo works
 145   ssh $SERVER2 echo works
 146
 147 It can be set up by running B<ssh-keygen -t ed25519; ssh-copy-id $SERVER1>
 148 and using an empty passphrase, or you can use B<ssh-agent>.
 149
 150 =back
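The input files listed above can also be created in one go. The following is a minimal Bash sketch and not part of GNU B<parallel>: the function name B<make_tutorial_files> is hypothetical, and it assumes that GNU B<seq> is installed. It uses B<printf> and B<seq> instead of B<parallel> and B<perl>, but is intended to write the same contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the tutorial's input files in directory $1
# (default: current directory). Assumes GNU `seq` is installed.
make_tutorial_files() {
  local dir=${1:-.} n
  mkdir -p "$dir" || return 1
  printf '%s\n' A B C > "$dir/abc-file"
  printf '%s\n' D E F > "$dir/def-file"
  printf 'A\0B\0C\0' > "$dir/abc0-file"          # NUL-separated values
  printf 'A_B_C_' > "$dir/abc_-file"             # underscore-separated values
  printf 'f1\tf2\nA\tB\nC\tD\n' > "$dir/tsv-file.tsv"
  for n in 8 128 30000 1000000; do
    seq "$n" > "$dir/num$n"                      # one number per line
  done
  { echo %head1; echo %head2; seq 10; } > "$dir/num_%header"
  printf 'HHHHAAABBBCCC' > "$dir/fixedlen"       # no trailing newline
}
```

Running B<make_tutorial_files .> writes the files into the current directory.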
 151
 152
 153 =head1 Input sources
 154
 155 GNU B<parallel> reads input from input sources. These can be files, the
 156 command line, and stdin (standard input or a pipe).
 157
 158 =head2 A single input source
 159
 160 Input can be read from the command line:
 161
 162   parallel echo ::: A B C
 163
 164 Output (the order may be different because the jobs are run in
 165 parallel):
 166
 167   A
 168   B
 169   C
 170
 171 The input source can be a file:
 172
 173   parallel -a abc-file echo
 174
 175 Output: Same as above.
 176
 177 STDIN (standard input) can be the input source:
 178
 179   cat abc-file | parallel echo
 180
 181 Output: Same as above.
 182
 183
 184 =head2 Multiple input sources
 185
 186 GNU B<parallel> can take multiple input sources given on the command
 187 line. GNU B<parallel> then generates all combinations of the input
 188 sources:
 189
 190   parallel echo ::: A B C ::: D E F
 191
 192 Output (the order may be different):
 193
 194   A D
 195   A E
 196   A F
 197   B D
 198   B E
 199   B F
 200   C D
 201   C E
 202   C F
 203
 204 The input sources can be files:
 205
 206   parallel -a abc-file -a def-file echo
 207
 208 Output: Same as above.
 209
 210 STDIN (standard input) can be one of the input sources using B<->:
 211
 212   cat abc-file | parallel -a - -a def-file echo
 213
 214 Output: Same as above.
 215
 216 Instead of B<-a> files can be given after B<::::>:
 217
 218   cat abc-file | parallel echo :::: - def-file
 219
 220 Output: Same as above.
 221
 222 ::: and :::: can be mixed:
 223
 224   parallel echo ::: A B C :::: def-file
 225
 226 Output: Same as above.
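Conceptually, the combinations correspond to a nested loop over the input sources. The sketch below is a sequential illustration in plain Bash and not how GNU B<parallel> is implemented; B<combine> is a hypothetical name. It prints the pairs in the order listed above, but runs nothing in parallel.

```shell
# Hypothetical illustration: every value of the first input source is
# paired with every value of the second input source.
combine() {
  local first=(A B C) second=(D E F) x y
  for x in "${first[@]}"; do
    for y in "${second[@]}"; do
      echo "$x $y"
    done
  done
}
combine
```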
 227
 228 =head3 Linking arguments from input sources
 229
 230 With B<--link> you can link the input sources and get one argument
 231 from each input source:
 232
 233   parallel --link echo ::: A B C ::: D E F
 234
 235 Output (the order may be different):
 236
 237   A D
 238   B E
 239   C F
 240
 241 If one of the input sources is too short, its values will wrap:
 242
 243   parallel --link echo ::: A B C D E ::: F G
 244
 245 Output (the order may be different):
 246
 247   A F
 248   B G
 249   C F
 250   D G
 251   E F
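The wrap-around can be pictured as indexing the shorter input source modulo its length. A sequential Bash sketch of that idea follows; B<link_wrap> is a hypothetical name, and this only illustrates the pairing shown above, not GNU B<parallel>'s implementation.

```shell
# Hypothetical illustration of --link with a short second input source.
# Assumes the first array is the longest one.
link_wrap() {
  local a=(A B C D E) b=(F G) i
  for ((i = 0; i < ${#a[@]}; i++)); do
    # When the shorter source runs out, start over from its first value.
    echo "${a[i]} ${b[i % ${#b[@]}]}"
  done
}
link_wrap
```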
 252
 253 For more flexible linking you can use B<:::+> and B<::::+>. They work
 254 like B<:::> and B<::::> except they link the previous input source to
 255 this input source.
 256
 257 This will link ABC to GHI:
 258
 259   parallel echo :::: abc-file :::+ G H I :::: def-file
 260
 261 Output (the order may be different):
 262
 263   A G D
 264   A G E
 265   A G F
 266   B H D
 267   B H E
 268   B H F
 269   C I D
 270   C I E
 271   C I F
 272
 273 This will link GHI to DEF:
 274
 275   parallel echo :::: abc-file ::: G H I ::::+ def-file
 276
 277 Output (the order may be different):
 278
 279   A G D
 280   A H E
 281   A I F
 282   B G D
 283   B H E
 284   B I F
 285   C G D
 286   C H E
 287   C I F
 288
 289 If one of the input sources is too short when using B<:::+> or
 290 B<::::+>, the rest will be ignored:
 291
 292   parallel echo ::: A B C D E :::+ F G
 293
 294 Output (the order may be different):
 295
 296   A F
 297   B G
 298
 299
 300 =head2 Changing the argument separator
 301
 302 GNU B<parallel> can use other separators than B<:::> or B<::::>. This is
 303 typically useful if B<:::> or B<::::> is used in the command to run:
 304
 305   parallel --arg-sep ,, echo ,, A B C :::: def-file
 306
 307 Output (the order may be different):
 308
 309   A D
 310   A E
 311   A F
 312   B D
 313   B E
 314   B F
 315   C D
 316   C E
 317   C F
 318
 319 Changing the argument file separator:
 320
 321   parallel --arg-file-sep // echo ::: A B C // def-file
 322
 323 Output: Same as above.
 324
 325
 326 =head2 Changing the argument delimiter
 327
 328 GNU B<parallel> will normally treat a full line as a single argument: It
 329 uses B<\n> as argument delimiter. This can be changed with B<-d>:
 330
 331   parallel -d _ echo :::: abc_-file
 332
 333 Output (the order may be different):
 334
 335   A
 336   B
 337   C
 338
 339 NUL can be given as B<\0>:
 340
 341   parallel -d '\0' echo :::: abc0-file
 342
 343 Output: Same as above.
 344
 345 A shorthand for B<-d '\0'> is B<-0> (this will often be used to read files
 346 from B<find ... -print0>):
 347
 348   parallel -0 echo :::: abc0-file
 349
 350 Output: Same as above.
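A NUL byte cannot occur in a Unix path name, so NUL-delimited input is safe even for file names that contain newlines. As a point of comparison, this sequential Bash sketch splits the same kind of NUL-delimited stream with B<read -d ''>. It does not use GNU B<parallel>, and B<print_nul_records> is a hypothetical name.

```shell
# Hypothetical helper: print each NUL-terminated record on its own line.
print_nul_records() {
  local record
  # -d '' makes read stop at NUL instead of newline; -r keeps backslashes.
  while IFS= read -r -d '' record; do
    printf '%s\n' "$record"
  done
}
printf 'A\0B\0C\0' | print_nul_records
```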
 351
 352 =head2 End-of-file value for input source
 353
 354 GNU B<parallel> can stop reading when it encounters a certain value:
 355
 356   parallel -E stop echo ::: A B stop C D
 357
 358 Output:
 359
 360   A
 361   B
 362
 363 =head2 Skipping empty lines
 364
 365 Using B<--no-run-if-empty> GNU B<parallel> will skip empty lines.
 366
 367   (echo 1; echo; echo 2) | parallel --no-run-if-empty echo
 368
 369 Output:
 370
 371   1
 372   2
 373
 374
 375 =head1 Building the command line
 376
 377 =head2 No command means arguments are commands
 378
 379 If no command is given after parallel the arguments themselves are
 380 treated as commands:
 381
 382   parallel ::: ls 'echo foo' pwd
 383
 384 Output (the order may be different):
 385
 386   [list of files in current dir]
 387   foo
 388   [/path/to/current/working/dir]
 389
 390 The command can be a script, a binary or a Bash function if the function is
 391 exported using B<export -f>:
 392
 393   # Only works in Bash
 394   my_func() {
 395     echo in my_func $1
 396   }
 397   export -f my_func
 398   parallel my_func ::: 1 2 3
 399
 400 Output (the order may be different):
 401
 402   in my_func 1
 403   in my_func 2
 404   in my_func 3
 405
 406 =head2 Replacement strings
 407
 408 =head3 The 7 predefined replacement strings
 409
 410 GNU B<parallel> has several replacement strings. If no replacement
 411 strings are used the default is to append B<{}>:
 412
 413   parallel echo ::: A/B.C
 414
 415 Output:
 416
 417   A/B.C
 418
 419 The default replacement string is B<{}>:
 420
 421   parallel echo {} ::: A/B.C
 422
 423 Output:
 424
 425   A/B.C
 426
 427 The replacement string B<{.}> removes the extension:
 428
 429   parallel echo {.} ::: A/B.C
 430
 431 Output:
 432
 433   A/B
 434
 435 The replacement string B<{/}> removes the path:
 436
 437   parallel echo {/} ::: A/B.C
 438
 439 Output:
 440
 441   B.C
 442
 443 The replacement string B<{//}> keeps only the path:
 444
 445   parallel echo {//} ::: A/B.C
 446
 447 Output:
 448
 449   A
 450
 451 The replacement string B<{/.}> removes the path and the extension:
 452
 453   parallel echo {/.} ::: A/B.C
 454
 455 Output:
 456
 457   B
 458
 459 The replacement string B<{#}> gives the job number:
 460
 461   parallel echo {#} ::: A B C
 462
 463 Output (the order may be different):
 464
 465   1
 466   2
 467   3
 468
 469 The replacement string B<{%}> gives the job slot number (between 1 and
 470 number of jobs to run in parallel):
 471
 472   parallel -j 2 echo {%} ::: A B C
 473
 474 Output (the order may be different and 1 and 2 may be swapped):
 475
 476   1
 477   2
 478   1
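For a simple argument such as B<A/B.C>, the path-related replacement strings correspond roughly to Bash parameter expansions. The sketch below is a comparison, not a definition: B<show_parts> is a hypothetical name, and edge cases can differ (for example, for an argument without a slash the B<${arg%/*}> expansion returns the argument unchanged).

```shell
# Hypothetical comparison of {.} {/} {//} {/.} with Bash expansions.
show_parts() {
  local arg=$1 base
  base=${arg##*/}            # strip the directory part
  echo "{.}=${arg%.*}"       # strip the last extension
  echo "{/}=$base"           # file name without the path
  echo "{//}=${arg%/*}"      # keep only the directory part
  echo "{/.}=${base%.*}"     # strip both path and extension
}
show_parts A/B.C
```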
 479
 480 =head3 Changing the replacement strings
 481
 482 The replacement string B<{}> can be changed with B<-I>:
 483
 484   parallel -I ,, echo ,, ::: A/B.C
 485
 486 Output:
 487
 488   A/B.C
 489
 490 The replacement string B<{.}> can be changed with B<--extensionreplace>:
 491
 492   parallel --extensionreplace ,, echo ,, ::: A/B.C
 493
 494 Output:
 495
 496   A/B
 497
 498 The replacement string B<{/}> can be changed with B<--basenamereplace>:
 499
 500   parallel --basenamereplace ,, echo ,, ::: A/B.C
 501
 502 Output:
 503
 504   B.C
 505
 506 The replacement string B<{//}> can be changed with B<--dirnamereplace>:
 507
 508   parallel --dirnamereplace ,, echo ,, ::: A/B.C
 509
 510 Output:
 511
 512   A
 513
 514 The replacement string B<{/.}> can be changed with B<--basenameextensionreplace>:
 515
 516   parallel --basenameextensionreplace ,, echo ,, ::: A/B.C
 517
 518 Output:
 519
 520   B
 521
 522 The replacement string B<{#}> can be changed with B<--seqreplace>:
 523
 524   parallel --seqreplace ,, echo ,, ::: A B C
 525
 526 Output (the order may be different):
 527
 528   1
 529   2
 530   3
 531
 532 The replacement string B<{%}> can be changed with B<--slotreplace>:
 533
 534   parallel -j2 --slotreplace ,, echo ,, ::: A B C
 535
 536 Output (the order may be different and 1 and 2 may be swapped):
 537
 538   1
 539   2
 540   1
 541
 542 =head3 Perl expression replacement string
 543
 544 When the predefined replacement strings are not flexible enough, a Perl
 545 expression can be used instead. One example is to remove two
 546 extensions, so that foo.tar.gz becomes foo:
 547
 548   parallel echo '{= s:\.[^.]+$::;s:\.[^.]+$::; =}' ::: foo.tar.gz
 549
 550 Output:
 551
 552   foo
 553
 554 In B<{= =}> you can access all of GNU B<parallel>'s internal functions
 555 and variables. A few are worth mentioning.
 556
 557 B<total_jobs()> returns the total number of jobs:
 558
 559   parallel echo Job {#} of {= '$_=total_jobs()' =} ::: {1..5}
 560
 561 Output:
 562
 563   Job 1 of 5
 564   Job 2 of 5
 565   Job 3 of 5
 566   Job 4 of 5
 567   Job 5 of 5
 568
 569 B<Q(...)> shell quotes the string:
 570
 571   parallel echo {} shell quoted is {= '$_=Q($_)' =} ::: '*/!#$'
 572
 573 Output:
 574
 575   */!#$ shell quoted is \*/\!\#\$
 576
 577 B<skip()> skips the job:
 578
 579   parallel echo {= 'if($_==3) { skip() }' =} ::: {1..5}
 580
 581 Output:
 582
 583   1
 584   2
 585   4
 586   5
 587
 588 B<@arg> contains the input source variables:
 589
 590   parallel echo {= 'if($arg[1]==$arg[2]) { skip() }' =} \
 591     ::: {1..3} ::: {1..3}
 592
 593 Output:
 594
 595   1 2
 596   1 3
 597   2 1
 598   2 3
 599   3 1
 600   3 2
 601
 602 If the strings B<{=> and B<=}> cause problems they can be replaced with B<--parens>:
 603
 604   parallel --parens ,,,, echo ',, s:\.[^.]+$::;s:\.[^.]+$::; ,,' \
 605     ::: foo.tar.gz
 606
 607 Output:
 608
 609   foo
 610
 611 To define a shorthand replacement string use B<--rpl>:
 612
 613   parallel --rpl '.. s:\.[^.]+$::;s:\.[^.]+$::;' echo '..' \
 614     ::: foo.tar.gz
 615
 616 Output: Same as above.
 617
 618 If the shorthand starts with B<{> it can be used as a positional
 619 replacement string, too:
 620
 621   parallel --rpl '{..} s:\.[^.]+$::;s:\.[^.]+$::;' echo '{..}' \
 622     ::: foo.tar.gz
 623
 624 Output: Same as above.
 625
 626 If the shorthand contains matching parentheses the replacement string
 627 becomes a dynamic replacement string and the string in the parentheses
 628 can be accessed as $$1. If there are multiple matching parentheses,
 629 the matched strings can be accessed using $$2, $$3 and so on.
 630
 631 You can think of this as giving arguments to the replacement
 632 string. Here we give the argument B<.tar.gz> to the replacement string
 633 B<{%I<string>}> which removes I<string>:
 634
 635   parallel --rpl '{%(.+?)} s/$$1$//;' echo {%.tar.gz}.zip ::: foo.tar.gz
 636
 637 Output:
 638
 639   foo.zip
 640
 641 Here we give the two arguments B<tar.gz> and B<zip> to the replacement
 642 string B<{/I<string1>/I<string2>}> which replaces I<string1> with
 643 I<string2>:
 644
 645   parallel --rpl '{/(.+?)/(.*?)} s/$$1/$$2/;' echo {/tar.gz/zip} \
 646     ::: foo.tar.gz
 647
 648 Output:
 649
 650   foo.zip
 651
 652
 653 GNU B<parallel>'s 7 replacement strings are implemented like this:
 654
 655   --rpl '{} '
 656   --rpl '{#} $_=$job->seq()'
 657   --rpl '{%} $_=$job->slot()'
 658   --rpl '{/} s:.*/::'
 659   --rpl '{//} $Global::use{"File::Basename"} ||=
 660            eval "use File::Basename; 1;"; $_ = dirname($_);'
 661   --rpl '{/.} s:.*/::; s:\.[^/.]+$::;'
 662   --rpl '{.} s:\.[^/.]+$::'
 663
 664 =head3 Positional replacement strings
 665
 666 With multiple input sources the argument from the individual input
 667 sources can be accessed with S<< B<{>numberB<}> >>:
 668
 669   parallel echo {1} and {2} ::: A B ::: C D
 670
 671 Output (the order may be different):
 672
 673   A and C
 674   A and D
 675   B and C
 676   B and D
 677
 678 The positional replacement strings can also be modified using B</>, B<//>, B</.>, and B<.>:
 679
 680   parallel echo /={1/} //={1//} /.={1/.} .={1.} ::: A/B.C D/E.F
 681
 682 Output (the order may be different):
 683
 684   /=B.C //=A /.=B .=A/B
 685   /=E.F //=D /.=E .=D/E
 686
 687 If a position is negative, it will refer to the input source counted
 688 from the end:
 689
 690   parallel echo 1={1} 2={2} 3={3} -1={-1} -2={-2} -3={-3} \
 691     ::: A B ::: C D ::: E F
 692
 693 Output (the order may be different):
 694
 695   1=A 2=C 3=E -1=E -2=C -3=A
 696   1=A 2=C 3=F -1=F -2=C -3=A
 697   1=A 2=D 3=E -1=E -2=D -3=A
 698   1=A 2=D 3=F -1=F -2=D -3=A
 699   1=B 2=C 3=E -1=E -2=C -3=B
 700   1=B 2=C 3=F -1=F -2=C -3=B
 701   1=B 2=D 3=E -1=E -2=D -3=B
 702   1=B 2=D 3=F -1=F -2=D -3=B
 703
 704
 705 =head3 Positional perl expression replacement string
 706
 707 To use a Perl expression as a positional replacement string simply
 708 prefix the Perl expression with the number and a space:
 709
 710   parallel echo '{=2 s:\.[^.]+$::;s:\.[^.]+$::; =} {1}' \
 711     ::: bar ::: foo.tar.gz
 712
 713 Output:
 714
 715   foo bar
 716
 717 If a shorthand defined using B<--rpl> starts with B<{> it can be used as
 718 a positional replacement string, too:
 719
 720   parallel --rpl '{..} s:\.[^.]+$::;s:\.[^.]+$::;' echo '{2..} {1}' \
 721     ::: bar ::: foo.tar.gz
 722
 723 Output: Same as above.
 724
 725
 726 =head3 Input from columns
 727
 728 The columns in a file can be bound to positional replacement strings
 729 using B<--colsep>. Here the columns are separated by TAB (\t):
 730
 731   parallel --colsep '\t' echo 1={1} 2={2} :::: tsv-file.tsv
 732
 733 Output (the order may be different):
 734
 735   1=f1 2=f2
 736   1=A 2=B
 737   1=C 2=D
 738
 739 =head3 Header defined replacement strings
 740
 741 With B<--header> GNU B<parallel> will use the first value of the input
 742 source as the name of the replacement string. Only the non-modified
 743 version B<{}> is supported:
 744
 745   parallel --header : echo f1={f1} f2={f2} ::: f1 A B ::: f2 C D
 746
 747 Output (the order may be different):
 748
 749   f1=A f2=C
 750   f1=A f2=D
 751   f1=B f2=C
 752   f1=B f2=D
 753
 754 It is useful with B<--colsep> for processing files with TAB separated values:
 755
 756   parallel --header : --colsep '\t' echo f1={f1} f2={f2} \
 757     :::: tsv-file.tsv
 758
 759 Output (the order may be different):
 760
 761   f1=A f2=B
 762   f1=C f2=D
 763
 764 =head3 More pre-defined replacement strings with --plus
 765
 766 B<--plus> adds the replacement strings B<{+/} {+.} {+..} {+...} {..}  {...}
 767 {/..} {/...} {##}>. The idea being that B<{+foo}> matches the opposite of B<{foo}>
 768 and B<{}> = B<{+/}>/B<{/}> = B<{.}>.B<{+.}> = B<{+/}>/B<{/.}>.B<{+.}> = B<{..}>.B<{+..}> =
 769 B<{+/}>/B<{/..}>.B<{+..}> = B<{...}>.B<{+...}> = B<{+/}>/B<{/...}>.B<{+...}>.
 770
 771   parallel --plus echo {} ::: dir/sub/file.ex1.ex2.ex3
 772   parallel --plus echo {+/}/{/} ::: dir/sub/file.ex1.ex2.ex3
 773   parallel --plus echo {.}.{+.} ::: dir/sub/file.ex1.ex2.ex3
 774   parallel --plus echo {+/}/{/.}.{+.} ::: dir/sub/file.ex1.ex2.ex3
 775   parallel --plus echo {..}.{+..} ::: dir/sub/file.ex1.ex2.ex3
 776   parallel --plus echo {+/}/{/..}.{+..} ::: dir/sub/file.ex1.ex2.ex3
 777   parallel --plus echo {...}.{+...} ::: dir/sub/file.ex1.ex2.ex3
 778   parallel --plus echo {+/}/{/...}.{+...} ::: dir/sub/file.ex1.ex2.ex3
 779
 780 Output:
 781
 782   dir/sub/file.ex1.ex2.ex3
 783
 784 B<{##}> is simply the total number of jobs:
 785
 786   parallel --plus echo Job {#} of {##} ::: {1..5}
 787
 788 Output:
 789
 790   Job 1 of 5
 791   Job 2 of 5
 792   Job 3 of 5
 793   Job 4 of 5
 794   Job 5 of 5
 795
 796 =head3 Dynamic replacement strings with --plus
 797
 798 B<--plus> also defines these dynamic replacement strings:
 799
 800 =over 19
 801
 802 =item B<{:-I<string>}>
 803
 804 Default value is I<string> if the argument is empty.
 805
 806 =item B<{:I<number>}>
 807
 808 Substring from I<number> till end of string.
 809
 810 =item B<{:I<number1>:I<number2>}>
 811
 812 Substring of length I<number2> starting at offset I<number1>.
 813
 814 =item B<{#I<string>}>
 815
 816 If the argument starts with I<string>, remove it.
 817
 818 =item B<{%I<string>}>
 819
 820 If the argument ends with I<string>, remove it.
 821
 822 =item B<{/I<string1>/I<string2>}>
 823
 824 Replace I<string1> with I<string2>.
 825
 826 =item B<{^I<string>}>
 827
 828 If the argument starts with I<string>, upper case it. I<string> must
 829 be a single letter.
 830
 831 =item B<{^^I<string>}>
 832
 833 If the argument contains I<string>, upper case it. I<string> must be a
 834 single letter.
 835
 836 =item B<{,I<string>}>
 837
 838 If the argument starts with I<string>, lower case it. I<string> must
 839 be a single letter.
 840
 841 =item B<{,,I<string>}>
 842
 843 If the argument contains I<string>, lower case it. I<string> must be a
 844 single letter.
 845
 846 =back
 847
 848 They are inspired by B<Bash>:
 849
 850   unset myvar
 851   echo ${myvar:-myval}
 852   parallel --plus echo {:-myval} ::: "$myvar"
 853
 854   myvar=abcAaAdef
 855   echo ${myvar:2}
 856   parallel --plus echo {:2} ::: "$myvar"
 857
 858   echo ${myvar:2:3}
 859   parallel --plus echo {:2:3} ::: "$myvar"
 860
 861   echo ${myvar#bc}
 862   parallel --plus echo {#bc} ::: "$myvar"
 863   echo ${myvar#abc}
 864   parallel --plus echo {#abc} ::: "$myvar"
 865
 866   echo ${myvar%de}
 867   parallel --plus echo {%de} ::: "$myvar"
 868   echo ${myvar%def}
 869   parallel --plus echo {%def} ::: "$myvar"
 870
 871   echo ${myvar/def/ghi}
 872   parallel --plus echo {/def/ghi} ::: "$myvar"
 873
 874   echo ${myvar^a}
 875   parallel --plus echo {^a} ::: "$myvar"
 876   echo ${myvar^^a}
 877   parallel --plus echo {^^a} ::: "$myvar"
 878
 879   myvar=AbcAaAdef
 880   echo ${myvar,A}
 881   parallel --plus echo '{,A}' ::: "$myvar"
 882   echo ${myvar,,A}
 883   parallel --plus echo '{,,A}' ::: "$myvar"
 884
 885 Output:
 886
 887   myval
 888   myval
 889   cAaAdef
 890   cAaAdef
 891   cAa
 892   cAa
 893   abcAaAdef
 894   abcAaAdef
 895   AaAdef
 896   AaAdef
 897   abcAaAdef
 898   abcAaAdef
 899   abcAaA
 900   abcAaA
 901   abcAaAghi
 902   abcAaAghi
 903   AbcAaAdef
 904   AbcAaAdef
 905   AbcAAAdef
 906   AbcAAAdef
 907   abcAaAdef
 908   abcAaAdef
 909   abcaaadef
 910   abcaaadef
 911
 912
 913 =head2 More than one argument
 914
 915 With B<--xargs> GNU B<parallel> will fit as many arguments as possible on a
 916 single line:
 917
 918   cat num30000 | parallel --xargs echo | wc -l
 919
 920 Output (if you run this under Bash on GNU/Linux):
 921
 922   2
 923
 924 The 30000 arguments fitted on 2 lines.
 925
 926 The maximal length of a single line can be set with B<-s>. With a maximal
 927 line length of 10000 chars 17 commands will be run:
 928
 929   cat num30000 | parallel --xargs -s 10000 echo | wc -l
 930
 931 Output:
 932
 933   17
 934
 935 For better parallelism GNU B<parallel> can distribute the arguments
 936 between all the parallel jobs when end of file is met.
 937
 938 Below GNU B<parallel> reads the last argument when generating the second
 939 job. When GNU B<parallel> reads the last argument, it spreads all the
 940 arguments for the second job over 4 jobs instead, as 4 parallel jobs
 941 are requested.
 942
 943 The first job will be the same as the B<--xargs> example above, but the
 944 second job will be split into 4 evenly sized jobs, resulting in a
 945 total of 5 jobs:
 946
 947   cat num30000 | parallel --jobs 4 -m echo | wc -l
 948
 949 Output (if you run this under Bash on GNU/Linux):
 950
 951   5
 952
 953 This is even more visible when running 4 jobs with 10 arguments. The
 954 10 arguments are being spread over 4 jobs:
 955
 956   parallel --jobs 4 -m echo ::: 1 2 3 4 5 6 7 8 9 10
 957
 958 Output:
 959
 960   1 2 3
 961   4 5 6
 962   7 8 9
 963   10
 964
 965 A replacement string can be part of a word. B<-m> will not repeat the context:
 966
 967   parallel --jobs 4 -m echo pre-{}-post ::: A B C D E F G
 968
 969 Output (the order may be different):
 970
 971   pre-A B-post
 972   pre-C D-post
 973   pre-E F-post
 974   pre-G-post
 975
 976 To repeat the context use B<-X> which otherwise works like B<-m>:
 977
 978   parallel --jobs 4 -X echo pre-{}-post ::: A B C D E F G
 979
 980 Output (the order may be different):
 981
 982   pre-A-post pre-B-post
 983   pre-C-post pre-D-post
 984   pre-E-post pre-F-post
 985   pre-G-post
 986
 987 To limit the number of arguments use B<-N>:
 988
 989   parallel -N3 echo ::: A B C D E F G H
 990
 991 Output (the order may be different):
 992
 993   A B C
 994   D E F
 995   G H
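Readers who know B<xargs> may recognize this grouping: B<xargs -n 3> builds its command lines the same way, only it runs them one at a time by default. A comparison sketch (B<group_by_three> is a hypothetical name):

```shell
# Hypothetical comparison: group the arguments three at a time with xargs.
group_by_three() {
  printf '%s\n' A B C D E F G H | xargs -n 3 echo
}
group_by_three
```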
 996
 997 B<-N> also sets the positional replacement strings:
 998
 999   parallel -N3 echo 1={1} 2={2} 3={3} ::: A B C D E F G H
1000
1001 Output (the order may be different):
1002
1003   1=A 2=B 3=C
1004   1=D 2=E 3=F
1005   1=G 2=H 3=
1006
1007 B<-N0> reads 1 argument but inserts none:
1008
1009   parallel -N0 echo foo ::: 1 2 3
1010
1011 Output:
1012
1013   foo
1014   foo
1015   foo
1016
1017 =head2 Quoting
1018
1019 Command lines that contain special characters may need to be protected from the shell.
1020
1021 The B<perl> program B<print "@ARGV\n"> basically works like B<echo>.
1022
1023   perl -e 'print "@ARGV\n"' A
1024
1025 Output:
1026
1027   A
1028
1029 To run that in parallel the command needs to be quoted. This does not work:
1030
1031   parallel perl -e 'print "@ARGV\n"' ::: This wont work
1032
1033 Output:
1034
1035   [Nothing]
1036
1037 To quote the command use B<-q>:
1038
1039   parallel -q perl -e 'print "@ARGV\n"' ::: This works
1040
1041 Output (the order may be different):
1042
1043   This
1044   works
1045
1046 Or you can quote the critical part using B<\'>:
1047
1048   parallel perl -e \''print "@ARGV\n"'\' ::: This works, too
1049
1050 Output (the order may be different):
1051
1052   This
1053   works,
1054   too
1055
1056 GNU B<parallel> can also \-quote full lines. Simply run this:
1057
1058   parallel --shellquote
1059   Warning: Input is read from the terminal. You either know what you
1060   Warning: are doing (in which case: YOU ARE AWESOME!) or you forgot
1061   Warning: ::: or :::: or to pipe data into parallel. If so
1062   Warning: consider going through the tutorial: man parallel_tutorial
1063   Warning: Press CTRL-D to exit.
1064   perl -e 'print "@ARGV\n"'
1065   [CTRL-D]
1066
1067 Output:
1068
1069   perl\ -e\ \'print\ \"@ARGV\\n\"\'
1070
1071 This can then be used as the command:
1072
1073   parallel perl\ -e\ \'print\ \"@ARGV\\n\"\' ::: This also works
1074
1075 Output (the order may be different):
1076
1077   This
1078   also
1079   works
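The empty output earlier in this section comes from a lost layer of quoting: the interactive shell removes the single quotes before GNU B<parallel> sees the command, and the command is then parsed by a shell a second time. Bash's B<printf %q> adds backslash quoting comparable to B<--shellquote>. The sketch below only demonstrates the round trip in Bash. B<requote> is a hypothetical name, and B<printf %q> may choose different but equivalent escapes than B<--shellquote>.

```shell
# Hypothetical helper: print each argument quoted so that a shell
# re-reads it as a single word (Bash only, uses printf %q).
requote() {
  printf '%q ' "$@"
  echo
}
quoted=$(requote perl -e 'print "@ARGV\n"')
echo "$quoted"
# Parsing the quoted string again yields the original three words.
eval "set -- $quoted"
echo "$# words"
```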
1080
1081
1082 =head2 Trimming space
1083
1084 Space can be trimmed on the arguments using B<--trim>:
1085
1086   parallel --trim r echo pre-{}-post ::: ' A '
1087
1088 Output:
1089
1090   pre- A-post
1091
1092 To trim on the left side:
1093
1094   parallel --trim l echo pre-{}-post ::: ' A '
1095
1096 Output:
1097
1098   pre-A -post
1099
1100 To trim on both sides:
1101
1102   parallel --trim lr echo pre-{}-post ::: ' A '
1103
1104 Output:
1105
1106   pre-A-post
1107
1108
1109 =head2 Respecting the shell
1110
1111 This tutorial uses Bash as the shell. GNU B<parallel> respects which
1112 shell you are using, so in B<zsh> you can do:
1113
1114   parallel echo \={} ::: zsh bash ls
1115
1116 Output:
1117
1118   /usr/bin/zsh
1119   /bin/bash
1120   /bin/ls
1121
1122 In B<csh> you can do:
1123
1124   parallel 'set a="{}"; if( { test -d "$a" } ) echo "$a is a dir"' ::: *
1125
1126 Output:
1127
1128   [somedir] is a dir
1129
1130 This also becomes useful if you use GNU B<parallel> in a shell script:
1131 GNU B<parallel> will use the same shell as the shell script.
1132
1133
1134 =head1 Controlling the output
1135
1136 The output can be prefixed with the argument:
1137
1138   parallel --tag echo foo-{} ::: A B C
1139
1140 Output (the order may be different):
1141
1142   A       foo-A
1143   B       foo-B
1144   C       foo-C
1145
1146 To prefix it with another string use B<--tagstring>:
1147
1148   parallel --tagstring {}-bar echo foo-{} ::: A B C
1149
1150 Output (the order may be different):
1151
1152   A-bar   foo-A
1153   B-bar   foo-B
1154   C-bar   foo-C
1155
1156 To see what commands will be run without running them use B<--dryrun>:
1157
1158   parallel --dryrun echo {} ::: A B C
1159
1160 Output (the order may be different):
1161
1162   echo A
1163   echo B
1164   echo C
1165
1166 To print the commands before running them use B<--verbose>:
1167
1168   parallel --verbose echo {} ::: A B C
1169
1170 Output (the order may be different):
1171
1172   echo A
1173   echo B
1174   A
1175   echo C
1176   B
1177   C
1178
1179 GNU B<parallel> will postpone the output until the command completes:
1180
1181   parallel -j2 'printf "%s-start\n%s" {} {};
1182     sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
1183
1184 Output:
1185
1186   2-start
1187   2-middle
1188   2-end
1189   1-start
1190   1-middle
1191   1-end
1192   4-start
1193   4-middle
1194   4-end
1195
1196 To get the output immediately use B<--ungroup>:
1197
1198   parallel -j2 --ungroup 'printf "%s-start\n%s" {} {};
1199     sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
1200
1201 Output:
1202
1203   4-start
1204   42-start
1205   2-middle
1206   2-end
1207   1-start
1208   1-middle
1209   1-end
1210   -middle
1211   4-end
1212
1213 B<--ungroup> is fast, but can cause half a line from one job to be mixed
1214 with half a line of another job. That has happened in the second line,
1215 where the line '4-middle' is mixed with '2-start'.
1216
1217 To avoid this use B<--linebuffer>:
1218
1219   parallel -j2 --linebuffer 'printf "%s-start\n%s" {} {};
1220     sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
1221
1222 Output:
1223
1224   4-start
1225   2-start
1226   2-middle
1227   2-end
1228   1-start
1229   1-middle
1230   1-end
1231   4-middle
1232   4-end
1233
1234 To force the output in the same order as the arguments use B<--keep-order>/B<-k>:
1235
1236   parallel -j2 -k 'printf "%s-start\n%s" {} {};
1237     sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
1238
1239 Output:
1240
1241   4-start
1242   4-middle
1243   4-end
1244   2-start
1245   2-middle
1246   2-end
1247   1-start
1248   1-middle
1249   1-end
1250
1251
1252 =head2 Saving output into files
1253
1254 GNU B<parallel> can save the output of each job into files:
1255
1256   parallel --files echo ::: A B C
1257
1258 Output will be similar to this:
1259
1260   /tmp/pAh6uWuQCg.par
1261   /tmp/opjhZCzAX4.par
1262   /tmp/W0AT_Rph2o.par
1263
1264 By default GNU B<parallel> will cache the output in files in B</tmp>. This
1265 can be changed by setting B<$TMPDIR> or B<--tmpdir>:
1266
1267   parallel --tmpdir /var/tmp --files echo ::: A B C
1268
1269 Output will be similar to this:
1270
1271   /var/tmp/N_vk7phQRc.par
1272   /var/tmp/7zA4Ccf3wZ.par
1273   /var/tmp/LIuKgF_2LP.par
1274
1275 Or:
1276
1277   TMPDIR=/var/tmp parallel --files echo ::: A B C
1278
1279 Output: Same as above.
1280
1281 The output files can be saved in a structured way using B<--results>:
1282
1283   parallel --results outdir echo ::: A B C
1284
1285 Output:
1286
1287   A
1288   B
1289   C
1290
1291 These files were also generated containing the standard output
1292 (stdout), standard error (stderr), and the sequence number (seq):
1293
1294   outdir/1/A/seq
1295   outdir/1/A/stderr
1296   outdir/1/A/stdout
1297   outdir/1/B/seq
1298   outdir/1/B/stderr
1299   outdir/1/B/stdout
1300   outdir/1/C/seq
1301   outdir/1/C/stderr
1302   outdir/1/C/stdout
1303
1304 B<--header :> will take the first value as name and use that in the
1305 directory structure. This is useful if you are using multiple input
1306 sources:
1307
1308   parallel --header : --results outdir echo ::: f1 A B ::: f2 C D
1309
1310 Generated files:
1311
1312   outdir/f1/A/f2/C/seq
1313   outdir/f1/A/f2/C/stderr
1314   outdir/f1/A/f2/C/stdout
1315   outdir/f1/A/f2/D/seq
1316   outdir/f1/A/f2/D/stderr
1317   outdir/f1/A/f2/D/stdout
1318   outdir/f1/B/f2/C/seq
1319   outdir/f1/B/f2/C/stderr
1320   outdir/f1/B/f2/C/stdout
1321   outdir/f1/B/f2/D/seq
1322   outdir/f1/B/f2/D/stderr
1323   outdir/f1/B/f2/D/stdout
1324
1325 The directories are named after the variables and their values.
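
The naming scheme can be sketched in plain shell. This only illustrates the layout listed above; it is not how GNU B<parallel> builds the paths internally, and the helper name B<results_dir> is made up for this example:

```shell
# results_dir is a hypothetical helper: it joins the output dir with
# alternating name/value pairs, mirroring the layout listed above.
results_dir() {
  dir=$1
  shift
  while [ $# -ge 2 ]; do
    dir="$dir/$1/$2"
    shift 2
  done
  printf '%s\n' "$dir"
}

# With --header : the input sources are named f1 and f2
results_dir outdir f1 A f2 C
# Without --header the input sources are numbered 1, 2, ...
results_dir outdir 1 A
```

This prints B<outdir/f1/A/f2/C> and B<outdir/1/A>, the directories that hold B<seq>, B<stderr>, and B<stdout> in the listings above.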
1326
1327 =head1 Controlling the execution
1328
1329 =head2 Number of simultaneous jobs
1330
1331 The number of concurrent jobs is given with B<--jobs>/B<-j>:
1332
1333   /usr/bin/time parallel -N0 -j64 sleep 1 :::: num128
1334
1335 With 64 jobs in parallel the 128 B<sleep>s will take 2-8 seconds to run -
1336 depending on how fast your machine is.
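
The lower bound follows from simple arithmetic: 128 jobs run 64 at a time need 2 rounds of 1 second each. A small sketch of that calculation (the function name B<min_seconds> is made up for this example, and the overhead of starting jobs is ignored):

```shell
# min_seconds TOTAL SLOTS SECS: rounds needed (rounded up) times SECS.
min_seconds() {
  total=$1
  slots=$2
  secs=$3
  echo $(( (total + slots - 1) / slots * secs ))
}

min_seconds 128 64 1    # 128 sleeps, 64 at a time
min_seconds 128 4 1     # the same with only 4 job slots
```

This prints 2 and 32. The real run time will be higher because starting each job takes time.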
1337
1338 By default B<--jobs> is the same as the number of CPU cores. So this:
1339
1340   /usr/bin/time parallel -N0 sleep 1 :::: num128
1341
1342 should take twice the time of running 2 jobs per CPU core:
1343
1344   /usr/bin/time parallel -N0 --jobs 200% sleep 1 :::: num128
1345
1346 B<--jobs 0> will run as many jobs in parallel as possible:
1347
1348   /usr/bin/time parallel -N0 --jobs 0 sleep 1 :::: num128
1349
1350 which should take 1-7 seconds depending on how fast your machine is.
1351
1352 B<--jobs> can read from a file which is re-read when a job finishes:
1353
1354   echo 50% > my_jobs
1355   /usr/bin/time parallel -N0 --jobs my_jobs sleep 1 :::: num128 &
1356   sleep 1
1357   echo 0 > my_jobs
1358   wait
1359
1360 During the first second only 50% of the CPU cores will run a job. Then
1361 B<0> is put into B<my_jobs> and the rest of the jobs will be started in
1362 parallel.
1363
1364 Instead of basing the percentage on the number of CPU cores
1365 GNU B<parallel> can base it on the number of CPUs:
1366
1367   parallel --use-cpus-instead-of-cores -N0 sleep 1 :::: num8
1368
1369 =head2 Shuffle job order
1370
1371 If you have many jobs (e.g. by multiple combinations of input
1372 sources), it can be handy to shuffle the jobs, so you get different
1373 values run. Use B<--shuf> for that:
1374
1375   parallel --shuf echo ::: 1 2 3 ::: a b c ::: A B C
1376
1377 Output:
1378
1379   All combinations but different order for each run.
1380
1381 =head2 Interactivity
1382
1383 GNU B<parallel> can ask the user if a command should be run using B<--interactive>:
1384
1385   parallel --interactive echo ::: 1 2 3
1386
1387 Output:
1388
1389   echo 1 ?...y
1390   echo 2 ?...n
1391   1
1392   echo 3 ?...y
1393   3
1394
1395 GNU B<parallel> can be used to put arguments on the command line for an
1396 interactive command such as B<emacs> to edit one file at a time:
1397
1398   parallel --tty emacs ::: 1 2 3
1399
1400 Or give multiple arguments in one go to open multiple files:
1401
1402   parallel -X --tty vi ::: 1 2 3
1403
1404 =head2 A terminal for every job
1405
1406 Using B<--tmux> GNU B<parallel> can start a terminal for every job run:
1407
1408   seq 10 20 | parallel --tmux 'echo start {}; sleep {}; echo done {}'
1409
1410 This will tell you to run something similar to:
1411
1412   tmux -S /tmp/tmsrPrO0 attach
1413
1414 Using normal B<tmux> keystrokes (CTRL-b n or CTRL-b p) you can cycle
1415 between windows of the running jobs. When a job is finished it will
1416 pause for 10 seconds before closing the window.
1417
1418 =head2 Timing
1419
1420 Some jobs do heavy I/O when they start. To avoid a thundering herd GNU
1421 B<parallel> can delay starting new jobs. B<--delay> I<X> will make
1422 sure there is at least I<X> seconds between each start:
1423
1424   parallel --delay 2.5 echo Starting {}\;date ::: 1 2 3
1425
1426 Output:
1427
1428   Starting 1
1429   Thu Aug 15 16:24:33 CEST 2013
1430   Starting 2
1431   Thu Aug 15 16:24:35 CEST 2013
1432   Starting 3
1433   Thu Aug 15 16:24:38 CEST 2013
1434
1435
1436 If jobs taking more than a certain amount of time are known to fail,
1437 they can be stopped with B<--timeout>. The accuracy of B<--timeout> is
1438 2 seconds:
1439
1440   parallel --timeout 4.1 sleep {}\; echo {} ::: 2 4 6 8
1441
1442 Output:
1443
1444   2
1445   4
1446
1447 GNU B<parallel> can compute the median runtime for jobs and kill those
1448 that take more than 200% of the median runtime:
1449
1450   parallel --timeout 200% sleep {}\; echo {} ::: 2.1 2.2 3 7 2.3
1451
1452 Output:
1453
1454   2.1
1455   2.2
1456   3
1457   2.3
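
The 200% rule can be checked by hand. The sketch below assumes the median is taken over all five runtimes; GNU B<parallel> itself can only use the runtimes of the jobs that have finished so far, so treat this as an approximation. The function name B<over_limit> is hypothetical:

```shell
# Runtimes (in seconds) of the jobs in the example above
runtimes="2.1 2.2 3 7 2.3"

# over_limit LIST: sort the runtimes, pick the median, and print
# every runtime that exceeds 200% of the median.
over_limit() {
  printf '%s\n' $1 | LC_ALL=C sort -n | LC_ALL=C awk '
    { v[NR] = $1 }
    END {
      median = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
      for (i = 1; i <= NR; i++)
        if (v[i] > 2 * median) print v[i]
    }'
}

over_limit "$runtimes"
```

The median is 2.3, so the limit is 4.6 seconds. Only 7 exceeds it, which matches the job missing from the output above.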
1458
1459 =head2 Progress information
1460
1461 Based on the runtime of completed jobs GNU B<parallel> can estimate the
1462 total runtime:
1463
1464   parallel --eta sleep ::: 1 3 2 2 1 3 3 2 1
1465
1466 Output:
1467
1468   Computers / CPU cores / Max jobs to run
1469   1:local / 2 / 2
1470
1471   Computer:jobs running/jobs completed/%of started jobs/
1472     Average seconds to complete
1473   ETA: 2s 0left 1.11avg  local:0/9/100%/1.1s
1474
1475 GNU B<parallel> can give progress information with B<--progress>:
1476
1477   parallel --progress sleep ::: 1 3 2 2 1 3 3 2 1
1478
1479 Output:
1480
1481   Computers / CPU cores / Max jobs to run
1482   1:local / 2 / 2
1483
1484   Computer:jobs running/jobs completed/%of started jobs/
1485     Average seconds to complete
1486   local:0/9/100%/1.1s
1487
1488 A progress bar can be shown with B<--bar>:
1489
1490   parallel --bar sleep ::: 1 3 2 2 1 3 3 2 1
1491
1492 And a graphic bar can be shown with B<--bar> and B<zenity>:
1493
1494   seq 1000 | parallel -j10 --bar '(echo -n {};sleep 0.1)' \
1495     2> >(perl -pe 'BEGIN{$/="\r";$|=1};s/\r/\n/g' |
1496          zenity --progress --auto-kill --auto-close)
1497
1498 A logfile of the jobs completed so far can be generated with B<--joblog>:
1499
1500   parallel --joblog /tmp/log exit  ::: 1 2 3 0
1501   cat /tmp/log
1502
1503 Output:
1504
1505   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
1506   1   :    1376577364.974 0.008   0    0       1       0      exit 1
1507   2   :    1376577364.982 0.013   0    0       2       0      exit 2
1508   3   :    1376577364.990 0.013   0    0       3       0      exit 3
1509   4   :    1376577365.003 0.003   0    0       0       0      exit 0
1510
1511 The log contains the job sequence, which host the job was run on, the
1512 start time and run time, how much data was transferred, the exit
1513 value, the signal that killed the job, and finally the command being
1514 run.
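
Because the joblog is a plain text table it is easy to post-process. A minimal sketch using B<awk>, assuming whitespace separated columns in the order shown above (so Exitval is column 7). The log is copied into a here-document to keep the example self-contained; B</tmp/log> from the run above could be used instead. B<failed_seqs> is a made-up name:

```shell
# failed_seqs reads a joblog on stdin and prints the Seq of every job
# whose Exitval (column 7) is not 0. The header line is skipped.
failed_seqs() {
  awk 'NR > 1 && $7 != 0 { print $1 }'
}

failed_seqs <<'EOF'
Seq Host Starttime      Runtime Send Receive Exitval Signal Command
1   :    1376577364.974 0.008   0    0       1       0      exit 1
2   :    1376577364.982 0.013   0    0       2       0      exit 2
3   :    1376577364.990 0.013   0    0       3       0      exit 3
4   :    1376577365.003 0.003   0    0       0       0      exit 0
EOF
```

This prints 1, 2, and 3 on separate lines. The sketch does not look at the Signal column.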
1515
1516 With a joblog GNU B<parallel> can be stopped and later pick up where it
1517 left off. It is important that the input of the completed jobs is
1518 unchanged.
1519
1520   parallel --joblog /tmp/log exit  ::: 1 2 3 0
1521   cat /tmp/log
1522   parallel --resume --joblog /tmp/log exit  ::: 1 2 3 0 0 0
1523   cat /tmp/log
1524
1525 Output:
1526
1527   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
1528   1   :    1376580069.544 0.008   0    0       1       0      exit 1
1529   2   :    1376580069.552 0.009   0    0       2       0      exit 2
1530   3   :    1376580069.560 0.012   0    0       3       0      exit 3
1531   4   :    1376580069.571 0.005   0    0       0       0      exit 0
1532
1533   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
1534   1   :    1376580069.544 0.008   0    0       1       0      exit 1
1535   2   :    1376580069.552 0.009   0    0       2       0      exit 2
1536   3   :    1376580069.560 0.012   0    0       3       0      exit 3
1537   4   :    1376580069.571 0.005   0    0       0       0      exit 0
1538   5   :    1376580070.028 0.009   0    0       0       0      exit 0
1539   6   :    1376580070.038 0.007   0    0       0       0      exit 0
1540
1541 Note how the start times of the last 2 jobs show that they were started by the second run.
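
Conceptually, resuming means skipping every sequence number that is already in the joblog. The following sketch models only that bookkeeping with B<awk>; it is not GNU B<parallel>'s implementation, and B<remaining_seqs> is a hypothetical name:

```shell
# remaining_seqs TOTAL: reads a joblog on stdin and prints the
# sequence numbers from 1 to TOTAL that are not yet in the log.
remaining_seqs() {
  awk -v total="$1" '
    NR > 1 { seen[$1] = 1 }
    END { for (i = 1; i <= total; i++) if (!(i in seen)) print i }'
}

# Joblog after the first run (4 jobs), resumed with 6 arguments
remaining_seqs 6 <<'EOF'
Seq Host Starttime      Runtime Send Receive Exitval Signal Command
1   :    1376580069.544 0.008   0    0       1       0      exit 1
2   :    1376580069.552 0.009   0    0       2       0      exit 2
3   :    1376580069.560 0.012   0    0       3       0      exit 3
4   :    1376580069.571 0.005   0    0       0       0      exit 0
EOF
```

This prints 5 and 6, the two jobs added by the second run.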
1542
1543 With B<--resume-failed> GNU B<parallel> will re-run the jobs that failed:
1544
1545   parallel --resume-failed --joblog /tmp/log exit  ::: 1 2 3 0 0 0
1546   cat /tmp/log
1547
1548 Output:
1549
1550   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
1551   1   :    1376580069.544 0.008   0    0       1       0      exit 1
1552   2   :    1376580069.552 0.009   0    0       2       0      exit 2
1553   3   :    1376580069.560 0.012   0    0       3       0      exit 3
1554   4   :    1376580069.571 0.005   0    0       0       0      exit 0
1555   5   :    1376580070.028 0.009   0    0       0       0      exit 0
1556   6   :    1376580070.038 0.007   0    0       0       0      exit 0
1557   1   :    1376580154.433 0.010   0    0       1       0      exit 1
1558   2   :    1376580154.444 0.022   0    0       2       0      exit 2
1559   3   :    1376580154.466 0.005   0    0       3       0      exit 3
1560
1561 Note how seq 1, 2, and 3 have been repeated because their exit values
1562 were different from 0.
1563
1564 B<--retry-failed> does almost the same as B<--resume-failed>. Where
1565 B<--resume-failed> reads the commands from the command line (and
1566 ignores the commands in the joblog), B<--retry-failed> ignores the
1567 command line and reruns the commands mentioned in the joblog.
1568
1569   parallel --retry-failed --joblog /tmp/log
1570   cat /tmp/log
1571
1572 Output:
1573
1574   Seq Host Starttime      Runtime Send Receive Exitval Signal Command
1575   1   :    1376580069.544 0.008   0    0       1       0      exit 1
1576   2   :    1376580069.552 0.009   0    0       2       0      exit 2
1577   3   :    1376580069.560 0.012   0    0       3       0      exit 3
1578   4   :    1376580069.571 0.005   0    0       0       0      exit 0
1579   5   :    1376580070.028 0.009   0    0       0       0      exit 0
1580   6   :    1376580070.038 0.007   0    0       0       0      exit 0
1581   1   :    1376580154.433 0.010   0    0       1       0      exit 1
1582   2   :    1376580154.444 0.022   0    0       2       0      exit 2
1583   3   :    1376580154.466 0.005   0    0       3       0      exit 3
1584   1   :    1376580164.633 0.010   0    0       1       0      exit 1
1585   2   :    1376580164.644 0.022   0    0       2       0      exit 2
1586   3   :    1376580164.666 0.005   0    0       3       0      exit 3
1587
1588
1589 =head2 Termination
1590
1591 =head3 Unconditional termination
1592
1593 By default GNU B<parallel> will wait for all jobs to finish before exiting.
1594
1595 If you send GNU B<parallel> the B<TERM> signal, GNU B<parallel> will
1596 stop spawning new jobs and wait for the remaining jobs to finish. If
1597 you send GNU B<parallel> the B<TERM> signal again, GNU B<parallel>
1598 will kill all running jobs and exit.
1599
1600 =head3 Termination dependent on job status
1601
1602 For certain jobs there is no need to continue if one of the jobs fails
1603 and has an exit code different from 0. GNU B<parallel> will stop spawning new jobs
1604 with B<--halt soon,fail=1>:
1605
1606   parallel -j2 --halt soon,fail=1 echo {}\; exit {} ::: 0 0 1 2 3
1607
1608 Output:
1609
1610   0
1611   0
1612   1
1613   parallel: This job failed:
1614   echo 1; exit 1
1615   parallel: Starting no more jobs. Waiting for 1 jobs to finish.
1616   2
1617
1618 With B<--halt now,fail=1> the running jobs will be killed immediately:
1619
1620   parallel -j2 --halt now,fail=1 echo {}\; exit {} ::: 0 0 1 2 3
1621
1622 Output:
1623
1624   0
1625   0
1626   1
1627   parallel: This job failed:
1628   echo 1; exit 1
1629
1630 If B<--halt> is given a percentage this percentage of the jobs must fail
1631 before GNU B<parallel> stops spawning more jobs:
1632
1633   parallel -j2 --halt soon,fail=20% echo {}\; exit {} \
1634     ::: 0 1 2 3 4 5 6 7 8 9
1635
1636 Output:
1637
1638   0
1639   1
1640   parallel: This job failed:
1641   echo 1; exit 1
1642   2
1643   parallel: This job failed:
1644   echo 2; exit 2
1645   parallel: Starting no more jobs. Waiting for 1 jobs to finish.
1646   3
1647   parallel: This job failed:
1648   echo 3; exit 3
1649
1650 If you are looking for success instead of failures, you can use
1651 B<success>. This will finish as soon as the first job succeeds:
1652
1653   parallel -j2 --halt now,success=1 echo {}\; exit {} ::: 1 2 3 0 4 5 6
1654
1655 Output:
1656
1657   1
1658   2
1659   3
1660   0
1661   parallel: This job succeeded:
1662   echo 0; exit 0
1663
1664 GNU B<parallel> can retry the command with B<--retries>. This is useful if a
1665 command fails for unknown reasons now and then.
1666
1667   parallel -k --retries 3 \
1668     'echo tried {} >>/tmp/runs; echo completed {}; exit {}' ::: 1 2 0
1669   cat /tmp/runs
1670
1671 Output:
1672
1673   completed 1
1674   completed 2
1675   completed 0
1676
1677   tried 1
1678   tried 2
1679   tried 1
1680   tried 2
1681   tried 1
1682   tried 2
1683   tried 0
1684
1685 Note how jobs 1 and 2 were tried 3 times, but job 0 was not retried because it had exit code 0.
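
The retry behaviour resembles the plain shell loop below. This is a sketch only: B<retry> is a hypothetical helper and not part of GNU B<parallel>, and it runs the attempts one after another instead of putting the job back on a queue:

```shell
# retry MAX CMD...: run CMD until it exits 0, at most MAX times.
# Prints the number of attempts and returns the last exit value.
retry() {
  max=$1
  shift
  attempt=0
  status=1
  while [ "$attempt" -lt "$max" ]; do
    attempt=$((attempt + 1))
    if "$@"; then
      status=0
      break
    else
      status=$?
    fi
  done
  echo "attempts: $attempt"
  return "$status"
}

retry 3 sh -c 'exit 1' || echo "gave up"   # fails every time
retry 3 sh -c 'exit 0'                     # succeeds at once
```

The first call reports 3 attempts and then gives up; the second reports 1 attempt.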
1686
1687 =head3 Termination signals (advanced)
1688
1689 Using B<--termseq> you can control which signals are sent when killing
1690 children. Normally children will be killed by sending them B<SIGTERM>,
1691 waiting 200 ms, then another B<SIGTERM>, waiting 100 ms, then another
1692 B<SIGTERM>, waiting 50 ms, then a B<SIGKILL>, finally waiting 25 ms
1693 before giving up. It looks like this:
1694
1695   show_signals() {
1696     perl -e 'for(keys %SIG) {
1697         $SIG{$_} = eval "sub { print \"Got $_\\n\"; }";
1698       }
1699       while(1){sleep 1}'
1700   }
1701   export -f show_signals
1702   echo | parallel --termseq TERM,200,TERM,100,TERM,50,KILL,25 \
1703     -u --timeout 1 show_signals
1704
1705 Output:
1706
1707   Got TERM
1708   Got TERM
1709   Got TERM
1710
1711 Or just:
1712
1713   echo | parallel -u --timeout 1 show_signals
1714
1715 Output: Same as above.
1716
1717 You can change this to B<SIGINT>, B<SIGTERM>, B<SIGKILL>:
1718
1719   echo | parallel --termseq INT,200,TERM,100,KILL,25 \
1720     -u --timeout 1 show_signals
1721
1722 Output:
1723
1724   Got INT
1725   Got TERM
1726
1727 The B<SIGKILL> does not show because it cannot be caught, and thus the
1728 child dies.
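
The argument to B<--termseq> alternates between signal names and waiting times in milliseconds. The sketch below spells out the default sequence step by step and sums up the waiting time; B<explain_termseq> is a made-up name for this example:

```shell
# explain_termseq SEQ: split SEQ on commas into signal,wait pairs.
explain_termseq() {
  total=0
  set -- $(echo "$1" | tr ',' ' ')
  while [ $# -ge 2 ]; do
    echo "send SIG$1, wait $2 ms"
    total=$((total + $2))
    shift 2
  done
  echo "total wait: $total ms"
}

explain_termseq TERM,200,TERM,100,TERM,50,KILL,25
```

The last line printed is B<total wait: 375 ms>, so with the default sequence a child gets well under half a second to clean up.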
1729
1730
1731 =head2 Limiting the resources
1732
1733 To avoid overloading systems GNU B<parallel> can look at the system load
1734 before starting another job:
1735
1736   parallel --load 100% echo load is less than {} job per cpu ::: 1
1737
1738 Output:
1739
1740   [when the load is less than the number of cpu cores]
1741   load is less than 1 job per cpu
1742
1743 GNU B<parallel> can also check if the system is swapping.
1744
1745   parallel --noswap echo the system is not swapping ::: now
1746
1747 Output:
1748
1749   [when the system is not swapping]
1750   the system is not swapping now
1751
1752 Some jobs need a lot of memory, and should only be started when there
1753 is enough memory free. Using B<--memfree> GNU B<parallel> can check if
1754 there is enough memory free. Additionally, GNU B<parallel> will kill
1755 off the youngest job if the free memory falls below 50% of the given
1756 size. The killed job will be put back on the queue and retried later.
1757
1758   parallel --memfree 1G echo will run if more than 1 GB is ::: free
1759
1760 GNU B<parallel> can run the jobs with a nice value. This will work both
1761 locally and remotely.
1762
1763   parallel --nice 17 echo this is being run with nice -n ::: 17
1764
1765 Output:
1766
1767   this is being run with nice -n 17
1768
1769 =head1 Remote execution
1770
1771 GNU B<parallel> can run jobs on remote servers. It uses B<ssh> to
1772 communicate with the remote machines.
1773
1774 =head2 Sshlogin
1775
1776 The most basic sshlogin is B<-S> I<host>:
1777
1778   parallel -S $SERVER1 echo running on ::: $SERVER1
1779
1780 Output:
1781
1782   running on [$SERVER1]
1783
1784 To use a different username prepend the server with I<username@>:
1785
1786   parallel -S username@$SERVER1 echo running on ::: username@$SERVER1
1787
1788 Output:
1789
1790   running on [username@$SERVER1]
1791
1792 The special sshlogin B<:> is the local machine:
1793
1794   parallel -S : echo running on ::: the_local_machine
1795
1796 Output:
1797
1798   running on the_local_machine
1799
1800 If B<ssh> is not in $PATH it can be prepended to $SERVER1:
1801
1802   parallel -S '/usr/bin/ssh '$SERVER1 echo custom ::: ssh
1803
1804 Output:
1805
1806   custom ssh
1807
1808 The B<ssh> command can also be given using B<--ssh>:
1809
1810   parallel --ssh /usr/bin/ssh -S $SERVER1 echo custom ::: ssh
1811
1812 or by setting B<$PARALLEL_SSH>:
1813
1814   export PARALLEL_SSH=/usr/bin/ssh
1815   parallel -S $SERVER1 echo custom ::: ssh
1816
1817 Several servers can be given using multiple B<-S>:
1818
1819   parallel -S $SERVER1 -S $SERVER2 echo ::: running on more hosts
1820
1821 Output (the order may be different):
1822
1823   running
1824   on
1825   more
1826   hosts
1827
1828 Or they can be separated by B<,>:
1829
1830   parallel -S $SERVER1,$SERVER2 echo ::: running on more hosts
1831
1832 Output: Same as above.
1833
1834 Or newline:
1835
1836   # This gives a \n between $SERVER1 and $SERVER2
1837   SERVERS="`echo $SERVER1; echo $SERVER2`"
1838   parallel -S "$SERVERS" echo ::: running on more hosts
1839
1840 They can also be read from a file (replace I<user@> with the user on B<$SERVER2>):
1841
1842   echo $SERVER1 > nodefile
1843   # Force 4 cores, special ssh-command, username
1844   echo 4//usr/bin/ssh user@$SERVER2 >> nodefile
1845   parallel --sshloginfile nodefile echo ::: running on more hosts
1846
1847 Output: Same as above.
1848
1849 Every time a job finishes, the B<--sshloginfile> will be re-read, so
1850 it is possible to both add and remove hosts while running.
1851
1852 The special B<--sshloginfile ..> reads from B<~/.parallel/sshloginfile>.
1853
1854 To force GNU B<parallel> to treat a server as having a given number of CPU
1855 cores, prepend the number of cores followed by B</> to the sshlogin:
1856
1857   parallel -S 4/$SERVER1 echo force {} cpus on server ::: 4
1858
1859 Output:
1860
1861   force 4 cpus on server
1862
1863 Servers can be put into groups by prepending I<@groupname> to the
1864 server and the group can then be selected by appending I<@groupname> to
1865 the argument if using B<--hostgroup>:
1866
1867   parallel --hostgroup -S @grp1/$SERVER1 -S @grp2/$SERVER2 echo {} \
1868     ::: run_on_grp1@grp1 run_on_grp2@grp2
1869
1870 Output:
1871
1872   run_on_grp1
1873   run_on_grp2
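
How such an argument splits into the value and the selected group can be sketched with shell parameter expansion. This assumes the argument ends in a single I<@groupname>; B<split_hostgroup> is a hypothetical helper and only models the splitting, not the scheduling:

```shell
# split_hostgroup ARG: print the argument without the group suffix
# and the group name after the last @.
split_hostgroup() {
  arg=$1
  printf 'argument=%s group=%s\n' "${arg%@*}" "${arg##*@}"
}

split_hostgroup run_on_grp1@grp1
split_hostgroup run_on_grp2@grp2
```

The first call prints B<argument=run_on_grp1 group=grp1>, which is why only B<run_on_grp1> shows up in the output above.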
1874
1875 A host can be in multiple groups by separating the groups with B<+>, and
1876 you can force GNU B<parallel> to limit the groups on which the command
1877 can be run with B<-S> I<@groupname>:
1878
1879   parallel -S @grp1 -S @grp1+grp2/$SERVER1 -S @grp2/$SERVER2 echo {} \
1880     ::: run_on_grp1 also_grp1
1881
1882 Output:
1883
1884   run_on_grp1
1885   also_grp1
1886
1887 =head2 Transferring files
1888
1889 GNU B<parallel> can transfer the files to be processed to the remote
1890 host. It does that using rsync.
1891
1892   echo This is input_file > input_file
1893   parallel -S $SERVER1 --transferfile {} cat ::: input_file
1894
1895 Output:
1896
1897   This is input_file
1898
1899 If the files are processed into another file, the resulting file can be
1900 transferred back:
1901
1902   echo This is input_file > input_file
1903   parallel -S $SERVER1 --transferfile {} --return {}.out \
1904     cat {} ">"{}.out ::: input_file
1905   cat input_file.out
1906
1907 Output: Same as above.
1908
1909 To remove the input and output file on the remote server use B<--cleanup>:
1910
1911   echo This is input_file > input_file
1912   parallel -S $SERVER1 --transferfile {} --return {}.out --cleanup \
1913     cat {} ">"{}.out ::: input_file
1914   cat input_file.out
1915
1916 Output: Same as above.
1917
1918 There is a shorthand for B<--transferfile {} --return --cleanup> called B<--trc>:
1919
1920   echo This is input_file > input_file
1921   parallel -S $SERVER1 --trc {}.out cat {} ">"{}.out ::: input_file
1922   cat input_file.out
1923
1924 Output: Same as above.
1925
1926 Some jobs need a common database for all jobs. GNU B<parallel> can
1927 transfer that using B<--basefile> which will transfer the file before the
1928 first job:
1929
1930   echo common data > common_file
1931   parallel --basefile common_file -S $SERVER1 \
1932     cat common_file\; echo {} ::: foo
1933
1934 Output:
1935
1936   common data
1937   foo
1938
1939 To remove it from the remote host after the last job use B<--cleanup>.
1940
1941
1942 =head2 Working dir
1943
1944 The default working dir on the remote machines is the login dir. This
1945 can be changed with B<--workdir> I<mydir>.
1946
1947 Files transferred using B<--transferfile> and B<--return> will be relative
1948 to I<mydir> on remote computers, and the command will be executed in
1949 the dir I<mydir>.
1950
1951 The special I<mydir> value B<...> will create working dirs under
1952 B<~/.parallel/tmp> on the remote computers. If B<--cleanup> is given
1953 these dirs will be removed.
1954
1955 The special I<mydir> value B<.> uses the current working dir.  If the
1956 current working dir is beneath your home dir, the value B<.> is
1957 treated as the relative path to your home dir. This means that if your
1958 home dir is different on remote computers (e.g. if your login is
1959 different) the relative path will still be relative to your home dir.
1960
1961   parallel -S $SERVER1 pwd ::: ""
1962   parallel --workdir . -S $SERVER1 pwd ::: ""
1963   parallel --workdir ... -S $SERVER1 pwd ::: ""
1964
1965 Output:
1966
1967   [the login dir on $SERVER1]
1968   [current dir relative on $SERVER1]
1969   [a dir in ~/.parallel/tmp/...]
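
The rule for B<--workdir .> can be sketched as follows. The helper B<remote_workdir> and the example paths are hypothetical, and the sketch only models the path computation described above:

```shell
# remote_workdir DIR HOME: print DIR relative to HOME if DIR is
# beneath HOME, otherwise print DIR unchanged.
remote_workdir() {
  dir=$1
  home=$2
  case $dir in
    "$home"/*) rel=${dir#"$home"/} ;;
    *)         rel=$dir ;;
  esac
  printf '%s\n' "$rel"
}

remote_workdir /home/joe/project/data /home/joe   # beneath home dir
remote_workdir /srv/data /home/joe                # outside home dir
```

This prints B<project/data> and B</srv/data>. The relative path is then resolved against the home dir of whatever user logs in on the remote computer.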
1970
1971
1972 =head2 Avoid overloading sshd
1973
1974 If many jobs are started on the same server, B<sshd> can be
1975 overloaded. GNU B<parallel> can insert a delay between each job run on
1976 the same server:
1977
1978   parallel -S $SERVER1 --sshdelay 0.2 echo ::: 1 2 3
1979
1980 Output (the order may be different):
1981
1982   1
1983   2
1984   3
1985
1986 B<sshd> will be less overloaded if using B<--controlmaster>, which will
1987 multiplex ssh connections:
1988
1989   parallel --controlmaster -S $SERVER1 echo ::: 1 2 3
1990
1991 Output: Same as above.
1992
1993 =head2 Ignore hosts that are down
1994
1995 In clusters with many hosts a few of them are often down. GNU B<parallel>
1996 can ignore those hosts. In this case the host 173.194.32.46 is down:
1997
1998   parallel --filter-hosts -S 173.194.32.46,$SERVER1 echo ::: bar
1999
2000 Output:
2001
2002   bar
2003
2004 =head2 Running the same commands on all hosts
2005
2006 GNU B<parallel> can run the same command on all the hosts:
2007
2008   parallel --onall -S $SERVER1,$SERVER2 echo ::: foo bar
2009
2010 Output (the order may be different):
2011
2012   foo
2013   bar
2014   foo
2015   bar
2016
2017 Often you will just want to run a single command on all hosts without
2018 arguments. B<--nonall> is a no argument B<--onall>:
2019
2020   parallel --nonall -S $SERVER1,$SERVER2 echo foo bar
2021
2022 Output:
2023
2024   foo bar
2025   foo bar
2026
2027 When B<--tag> is used with B<--nonall> and B<--onall> the B<--tagstring> is the host:
2028
2029   parallel --nonall --tag -S $SERVER1,$SERVER2 echo foo bar
2030
2031 Output (the order may be different):
2032
2033   $SERVER1 foo bar
2034   $SERVER2 foo bar
2035
2036 B<--jobs> sets the number of servers to log in to in parallel.
2037
2038 =head2 Transferring environment variables and functions
2039
2040 B<env_parallel> is a shell function that transfers all aliases,
2041 functions, variables, and arrays. You activate it by running:
2042
2043   source `which env_parallel.bash`
2044
2045 Replace B<bash> with the shell you use.
2046
2047 Now you can use B<env_parallel> instead of B<parallel> and still have
2048 your environment:
2049
2050   alias myecho=echo
2051   myvar="Joe's var is"
2052   env_parallel -S $SERVER1 'myecho $myvar' ::: green
2053
2054 Output:
2055
2056   Joe's var is green
2057
2058 The disadvantage is that if your environment is huge B<env_parallel>
2059 will fail.
2060
2061 When B<env_parallel> fails, you can still use B<--env> to tell GNU
2062 B<parallel> to transfer an environment variable to the remote system.
2063
2064   MYVAR='foo bar'
2065   export MYVAR
2066   parallel --env MYVAR -S $SERVER1 echo '$MYVAR' ::: baz
2067
2068 Output:
2069
2070   foo bar baz
2071
2072 This works for functions, too, if your shell is Bash:
2073
2074   # This only works in Bash
2075   my_func() {
2076     echo in my_func $1
2077   }
2078   export -f my_func
2079   parallel --env my_func -S $SERVER1 my_func ::: baz
2080
2081 Output:
2082
2083   in my_func baz
2084
2085 GNU B<parallel> can copy all user defined variables and functions to
2086 the remote system. It just needs to record which ones to ignore in
2087 B<~/.parallel/ignored_vars>. Do that by running this once:
2088
2089   parallel --record-env
2090   cat ~/.parallel/ignored_vars
2091
2092 Output:
2093
2094   [list of variables to ignore - including $PATH and $HOME]
2095
2096 Now all other variables and functions defined will be copied when
2097 using B<--env _>.
2098
2099   # The function is only copied if using Bash
2100   my_func2() {
2101     echo in my_func2 $VAR $1
2102   }
2103   export -f my_func2
2104   VAR=foo
2105   export VAR
2106
2107   parallel --env _ -S $SERVER1 'echo $VAR; my_func2' ::: bar
2108
2109 Output:
2110
2111   foo
2112   in my_func2 foo bar
2113
2114 If you use B<env_parallel> the variables, functions, and aliases do
2115 not even need to be exported to be copied:
2116
2117   NOT='not exported var'
2118   alias myecho=echo
2119   not_ex() {
2120     myecho in not_exported_func $NOT $1
2121   }
2122   env_parallel --env _ -S $SERVER1 'echo $NOT; not_ex' ::: bar
2123
2124 Output:
2125
2126   not exported var
2127   in not_exported_func not exported var bar
2128
2129
2130 =head2 Showing what is actually run
2131
2132 B<--verbose> will show the command that would be run on the local
2133 machine.
2134
2135 When using B<--cat>, B<--pipepart>, or when a job is run on a remote
2136 machine, the command is wrapped with helper scripts. B<-vv> shows all
2137 of this.
2138
2139   parallel -vv --pipepart --block 1M wc :::: num30000
2140
2141 Output:
2142
2143   <num30000 perl -e 'while(@ARGV) { sysseek(STDIN,shift,0) || die;
2144   $left = shift; while($read = sysread(STDIN,$buf, ($left > 131072
2145   ? 131072 : $left))){ $left -= $read; syswrite(STDOUT,$buf); } }'
2146   0 0 0 168894 | (wc)
2147     30000   30000  168894
2148
2149 When the command gets more complex, the output is so hard to read,
2150 that it is only useful for debugging:
2151
2152   my_func3() {
2153     echo in my_func $1 > $1.out
2154   }
2155   export -f my_func3
2156   parallel -vv --workdir ... --nice 17 --env _ --trc {}.out \
2157     -S $SERVER1 my_func3 {} ::: abc-file
2158
2159 Output will be similar to:
2160
2161
2162   ( ssh server -- mkdir -p ./.parallel/tmp/aspire-1928520-1;rsync
2163   --protocol 30 -rlDzR -essh ./abc-file
2164   server:./.parallel/tmp/aspire-1928520-1 );ssh server -- exec perl -e
2165   \''@GNU_Parallel=("use","IPC::Open3;","use","MIME::Base64");
2166   eval"@GNU_Parallel";my$eval=decode_base64(join"",@ARGV);eval$eval;'\'
2167   c3lzdGVtKCJta2RpciIsIi1wIiwiLS0iLCIucGFyYWxsZWwvdG1wL2FzcGlyZS0xOTI4N
2168   TsgY2hkaXIgIi5wYXJhbGxlbC90bXAvYXNwaXJlLTE5Mjg1MjAtMSIgfHxwcmludChTVE
2169   BhcmFsbGVsOiBDYW5ub3QgY2hkaXIgdG8gLnBhcmFsbGVsL3RtcC9hc3BpcmUtMTkyODU
2170   iKSAmJiBleGl0IDI1NTskRU5WeyJPTERQV0QifT0iL2hvbWUvdGFuZ2UvcHJpdmF0L3Bh
2171   IjskRU5WeyJQQVJBTExFTF9QSUQifT0iMTkyODUyMCI7JEVOVnsiUEFSQUxMRUxfU0VRI
2172   0BiYXNoX2Z1bmN0aW9ucz1xdyhteV9mdW5jMyk7IGlmKCRFTlZ7IlNIRUxMIn09fi9jc2
2173   ByaW50IFNUREVSUiAiQ1NIL1RDU0ggRE8gTk9UIFNVUFBPUlQgbmV3bGluZXMgSU4gVkF
2174   TL0ZVTkNUSU9OUy4gVW5zZXQgQGJhc2hfZnVuY3Rpb25zXG4iOyBleGVjICJmYWxzZSI7
2175   YXNoZnVuYyA9ICJteV9mdW5jMygpIHsgIGVjaG8gaW4gbXlfZnVuYyBcJDEgPiBcJDEub
2176   Xhwb3J0IC1mIG15X2Z1bmMzID4vZGV2L251bGw7IjtAQVJHVj0ibXlfZnVuYzMgYWJjLW
2177   RzaGVsbD0iJEVOVntTSEVMTH0iOyR0bXBkaXI9Ii90bXAiOyRuaWNlPTE3O2RveyRFTlZ
2178   MRUxfVE1QfT0kdG1wZGlyLiIvcGFyIi5qb2luIiIsbWFweygwLi45LCJhIi4uInoiLCJB
2179   KVtyYW5kKDYyKV19KDEuLjUpO313aGlsZSgtZSRFTlZ7UEFSQUxMRUxfVE1QfSk7JFNJ
2180   fT1zdWJ7JGRvbmU9MTt9OyRwaWQ9Zm9yazt1bmxlc3MoJHBpZCl7c2V0cGdycDtldmFse
2181   W9yaXR5KDAsMCwkbmljZSl9O2V4ZWMkc2hlbGwsIi1jIiwoJGJhc2hmdW5jLiJAQVJHVi
2182   JleGVjOiQhXG4iO31kb3skcz0kczwxPzAuMDAxKyRzKjEuMDM6JHM7c2VsZWN0KHVuZGV
2183   mLHVuZGVmLCRzKTt9dW50aWwoJGRvbmV8fGdldHBwaWQ9PTEpO2tpbGwoU0lHSFVQLC0k
2184   dW5sZXNzJGRvbmU7d2FpdDtleGl0KCQ/JjEyNz8xMjgrKCQ/JjEyNyk6MSskPz4+OCk=;
2185   _EXIT_status=$?; mkdir -p ./.; rsync --protocol 30 --rsync-path=cd\
2186   ./.parallel/tmp/aspire-1928520-1/./.\;\ rsync -rlDzR -essh
2187   server:./abc-file.out ./.;ssh server -- \(rm\ -f\
2188   ./.parallel/tmp/aspire-1928520-1/abc-file\;\ sh\ -c\ \'rmdir\
2189   ./.parallel/tmp/aspire-1928520-1/\ ./.parallel/tmp/\ ./.parallel/\
2190   2\>/dev/null\'\;rm\ -rf\ ./.parallel/tmp/aspire-1928520-1\;\);ssh
2191   server -- \(rm\ -f\ ./.parallel/tmp/aspire-1928520-1/abc-file.out\;\
2192   sh\ -c\ \'rmdir\ ./.parallel/tmp/aspire-1928520-1/\ ./.parallel/tmp/\
2193   ./.parallel/\ 2\>/dev/null\'\;rm\ -rf\
2194   ./.parallel/tmp/aspire-1928520-1\;\);ssh server -- rm -rf
2195   .parallel/tmp/aspire-1928520-1; exit $_EXIT_status;
2196
2197 =head1 Saving output to shell variables (advanced)
2198
2199 GNU B<parset> will set shell variables to the output of GNU
2200 B<parallel>. GNU B<parset> has one important limitation: It cannot be
2201 part of a pipe. In particular this means it cannot read anything from
2202 standard input (stdin) or pipe output to another program.
2203
2204 To use GNU B<parset> prepend the command with the destination variables:
2205
2206   parset myvar1,myvar2 echo ::: a b
2207   echo $myvar1
2208   echo $myvar2
2209
2210 Output:
2211
2212   a
2213   b
2214
2215 If you only give a single variable, it will be treated as an array:
2216
2217   parset myarray seq {} 5 ::: 1 2 3
2218   echo "${myarray[1]}"
2219
2220 Output:
2221
2222   2
2223   3
2224   4
2225   5
2226
2227 The commands to run can be an array:
2228
2229   cmd=("echo '<<joe  \"double  space\"  cartoon>>'" "pwd")
2230   parset data ::: "${cmd[@]}"
2231   echo "${data[0]}"
2232   echo "${data[1]}"
2233
2234 Output:
2235
2236   <<joe  "double  space"  cartoon>>
2237   [current dir]
2238
2239
2240 =head1 Saving to an SQL base (advanced)
2241
2242 GNU B<parallel> can save into an SQL base. Point GNU B<parallel> to a
2243 table and it will put the joblog there together with the variables and
2244 the output each in their own column.
2245
2246 =head2 CSV as SQL base
2247
2248 The simplest is to use a CSV file as the storage table:
2249
2250   parallel --sqlandworker csv:///%2Ftmp/log.csv \
2251     seq ::: 10 ::: 12 13 14
2252   cat /tmp/log.csv
2253
2254 Note how '/' in the path must be written as %2F.
2255
2256 Output will be similar to:
2257
2258   Seq,Host,Starttime,JobRuntime,Send,Receive,Exitval,_Signal,
2259     Command,V1,V2,Stdout,Stderr
2260   1,:,1458254498.254,0.069,0,9,0,0,"seq 10 12",10,12,"10
2261   11
2262   12
2263   ",
2264   2,:,1458254498.278,0.080,0,12,0,0,"seq 10 13",10,13,"10
2265   11
2266   12
2267   13
2268   ",
2269   3,:,1458254498.301,0.083,0,15,0,0,"seq 10 14",10,14,"10
2270   11
2271   12
2272   13
2273   14
2274   ",
2275
2276 A proper CSV reader (like LibreOffice or R's read.csv) will read this
2277 format correctly - even with fields containing newlines as above.
2278
2279 If the output is big you may want to put it into files using B<--results>:
2280
2281   parallel --results outdir --sqlandworker csv:///%2Ftmp/log2.csv \
2282     seq ::: 10 ::: 12 13 14
2283   cat /tmp/log2.csv
2284
2285 Output will be similar to:
2286
2287   Seq,Host,Starttime,JobRuntime,Send,Receive,Exitval,_Signal,
2288     Command,V1,V2,Stdout,Stderr
2289   1,:,1458824738.287,0.029,0,9,0,0,
2290     "seq 10 12",10,12,outdir/1/10/2/12/stdout,outdir/1/10/2/12/stderr
2291   2,:,1458824738.298,0.025,0,12,0,0,
2292     "seq 10 13",10,13,outdir/1/10/2/13/stdout,outdir/1/10/2/13/stderr
2293   3,:,1458824738.309,0.026,0,15,0,0,
2294     "seq 10 14",10,14,outdir/1/10/2/14/stdout,outdir/1/10/2/14/stderr
2295
2296
2297 =head2 DBURL as table
2298
2299 The CSV file is an example of a DBURL.
2300
2301 GNU B<parallel> uses a DBURL to address the table. A DBURL has this format:
2302
2303   vendor://[[user][:password]@][host][:port]/[database[/table]]
2304
2305 Examples:
2306
2307   mysql://scott:tiger@my.example.com/mydatabase/mytable
2308   postgresql://scott:tiger@pg.example.com/mydatabase/mytable
2309   sqlite3:///%2Ftmp%2Fmydatabase/mytable
2310   csv:///%2Ftmp/log.csv
2311
2312 To refer to B</tmp/mydatabase> with B<sqlite> or B<csv> you need to
2313 encode the B</> as B<%2F>.
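
The encoding can be scripted. The following is a minimal sketch and not part of GNU B<parallel>: the function name B<dburl_for_file> is made up for this illustration, and it only rewrites B</> as B<%2F>, so it assumes the path contains no other characters that need URL encoding.

```shell
# dburl_for_file VENDOR PATH [TABLE]
# Hypothetical helper (not part of GNU parallel): builds a DBURL for a
# file-based vendor such as sqlite3 or csv by writing every '/' in PATH
# as %2F. No other characters are encoded.
dburl_for_file() (
  vendor=$1 path=$2 table=${3-}
  encoded=$(printf '%s' "$path" | sed 's|/|%2F|g')
  printf '%s:///%s%s\n' "$vendor" "$encoded" "${table:+/$table}"
)

dburl_for_file sqlite3 /tmp/mydatabase mytable
dburl_for_file csv /tmp log.csv
```

The two calls print the B<sqlite3> and B<csv> DBURLs from the example list above.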
2314
2315 Run a job using B<sqlite> on B<mytable> in B</tmp/mydatabase>:
2316
2317   DBURL=sqlite3:///%2Ftmp%2Fmydatabase
2318   DBURLTABLE=$DBURL/mytable
2319   parallel --sqlandworker $DBURLTABLE echo ::: foo bar ::: baz quuz
2320
2321 To see the result:
2322
2323   sql $DBURL 'SELECT * FROM mytable ORDER BY Seq;'
2324
2325 Output will be similar to:
2326
2327   Seq|Host|Starttime|JobRuntime|Send|Receive|Exitval|_Signal|
2328     Command|V1|V2|Stdout|Stderr
2329   1|:|1451619638.903|0.806||8|0|0|echo foo baz|foo|baz|foo baz
2330   |
2331   2|:|1451619639.265|1.54||9|0|0|echo foo quuz|foo|quuz|foo quuz
2332   |
2333   3|:|1451619640.378|1.43||8|0|0|echo bar baz|bar|baz|bar baz
2334   |
2335   4|:|1451619641.473|0.958||9|0|0|echo bar quuz|bar|quuz|bar quuz
2336   |
2337
2338 The first columns are well known from B<--joblog>. B<V1> and B<V2> are
2339 data from the input sources. B<Stdout> and B<Stderr> are standard
2340 output and standard error, respectively.
2341
2342 =head2 Using multiple workers
2343
2344 Using an SQL base as storage costs overhead on the order of 1 second
2345 per job.
2346
2347 One of the situations where it makes sense is if you have multiple
2348 workers.
2349
2350 You can then have a single master machine that submits jobs to the SQL
2351 base (but does not do any of the work):
2352
2353   parallel --sqlmaster $DBURLTABLE echo ::: foo bar ::: baz quuz
2354
2355 On the worker machines you run exactly the same command except you
2356 replace B<--sqlmaster> with B<--sqlworker>.
2357
2358   parallel --sqlworker $DBURLTABLE echo ::: foo bar ::: baz quuz
2359
2360 To run a master and a worker on the same machine use B<--sqlandworker>
2361 as shown earlier.
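
As the three invocations differ only in a single option, a wrapper script can choose the option from a role name. This is only a sketch: the function B<sql_role_option> and the role names B<master>, B<worker>, and B<both> are hypothetical and not part of GNU B<parallel>.

```shell
# sql_role_option ROLE
# Hypothetical helper: maps a role name to the matching GNU parallel option.
sql_role_option() {
  case $1 in
    master) printf '%s\n' --sqlmaster ;;
    worker) printf '%s\n' --sqlworker ;;
    both)   printf '%s\n' --sqlandworker ;;
    *)      echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Example use (ROLE and DBURLTABLE are assumed to be set by the caller):
#   parallel "$(sql_role_option "$ROLE")" "$DBURLTABLE" echo ::: foo bar ::: baz quuz
```

The rest of the command line stays identical on the master and on the workers, which is what the SQL base relies on.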
2362
2363
2364 =head1 --pipe
2365
2366 The B<--pipe> functionality puts GNU B<parallel> in a different mode:
2367 Instead of treating the data on stdin (standard input) as arguments
2368 for a command to run, the data will be sent to stdin (standard input)
2369 of the command.
2370
2371 The typical situation is:
2372
2373   command_A | command_B | command_C
2374
2375 where command_B is slow, and you want to speed up command_B.
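
A concrete instance of this pattern could look as follows. It is an illustration only: B<seq> plays the role of command_A, B<wc -l> is the command_B that gets parallelized, and B<awk> is command_C. The function name B<count_lines> is made up, and B<--block 100k> is used only so that this small input is split into several chunks.

```shell
# command_A: seq produces 100000 lines.
# command_B: wc -l, started by GNU parallel once per chunk of stdin.
# command_C: awk adds up the per-chunk line counts.
count_lines() {
  seq 100000 |
    parallel --pipe --block 100k wc -l |
    awk '{ total += $1 } END { print total }'
}

count_lines
```

Every input line is passed to exactly one instance of B<wc -l>, so the sum printed by B<awk> equals the number of lines produced by B<seq>.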
2376
2377 =head2 Chunk size
2378
2379 By default GNU B<parallel> will start an instance of command_B, read a
2380 chunk of 1 MB, and pass that to the instance. Then start another
2381 instance, read another chunk, and pass that to the second instance.
2382
2383   cat num1000000 | parallel --pipe wc
2384
2385 Output (the order may be different):
2386
2387   165668  165668 1048571
2388   149797  149797 1048579
2389   149796  149796 1048572
2390   149797  149797 1048579
2391   149797  149797 1048579
2392   149796  149796 1048572
2393    85349   85349  597444
2394
2395 The size of the chunk is not exactly 1 MB because GNU B<parallel> only
2396 passes full lines - never half a line, thus the blocksize is only
2397 1 MB on average. You can change the block size to 2 MB with B<--block>:
2398
2399   cat num1000000 | parallel --pipe --block 2M wc
2400
2401 Output (the order may be different):
2402
2403   315465  315465 2097150
2404   299593  299593 2097151
2405   299593  299593 2097151
2406    85349   85349  597444
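
That chunks always end on a line boundary can be checked with a small experiment. This is a sketch under the assumption that the input itself ends with a newline (as the output of B<seq> does); the temporary file is only there so the input can be compared with the re-joined output.

```shell
tmp=$(mktemp)
seq 100000 > "$tmp"

# -k (keep order) prints the chunks in input order, so re-joining them
# must reproduce the input byte for byte.
if parallel -k --pipe --block 10k cat < "$tmp" | cmp -s - "$tmp"; then
  echo "chunks re-join to the original input"
fi

# Print the distinct last bytes of all chunks; od shows a newline as \n.
parallel --pipe --block 10k 'tail -c 1 | od -An -c' < "$tmp" | sort -u

rm -f "$tmp"
```

If a chunk ended in the middle of a line, the second command would list a digit in addition to the newline.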
2407
2408 GNU B<parallel> treats each line as a record. If the order of records
2409 is unimportant (e.g. you need all lines processed, but you do not care
2410 which is processed first), then you can use B<--roundrobin>. Without
2411 B<--roundrobin> GNU B<parallel> will start a command per block; with
2412 B<--roundrobin> only the requested number of jobs will be started
2413 (B<--jobs>). The records will then be distributed between the running
2414 jobs:
2415
2416   cat num1000000 | parallel --pipe -j4 --roundrobin wc
2417
2418 Output will be similar to:
2419
2420   149797  149797 1048579
2421   299593  299593 2097151
2422   315465  315465 2097150
2423   235145  235145 1646016
2424
2425 One of the 4 instances got a single block, 2 instances got 2 full
2426 blocks each, and one instance got 1 full and 1 partial block.
2427
2428 =head2 Records
2429
2430 GNU B<parallel> sees the input as records. The default record is a single
2431 line.
2432
2433 Using B<-N140000> GNU B<parallel> will read 140000 records at a time:
2434
2435   cat num1000000 | parallel --pipe -N140000 wc
2436
2437 Output (the order may be different):
2438
2439   140000  140000  868895
2440   140000  140000  980000
2441   140000  140000  980000
2442   140000  140000  980000
2443   140000  140000  980000
2444   140000  140000  980000
2445   140000  140000  980000
2446    20000   20000  140001
2447
2448 Note that the last job could not get the full 140000 lines, but
2449 only 20000 lines.
2450
2451 If a record is 75 lines B<-L> can be used:
2452
2453   cat num1000000 | parallel --pipe -L75 wc
2454
2455 Output (the order may be different):
2456
2457   165600  165600 1048095
2458   149850  149850 1048950
2459   149775  149775 1048425
2460   149775  149775 1048425
2461   149850  149850 1048950
2462   149775  149775 1048425
2463    85350   85350  597450
2464       25      25     176
2465
2466 Note how GNU B<parallel> still reads a block of around 1 MB; but
2467 instead of passing full lines to B<wc> it passes full 75 lines at a
2468 time. This of course does not hold for the last job (which in this
2469 case got 25 lines).
2470
2471 =head2 Fixed length records
2472
2473 Fixed length records can be processed by setting B<--recend ''> and
2474 B<--block I<recordsize>>. A header of size I<n> can be processed with
2475 B<--header .{I<n>}>.
2476
2477 Here is how to process a file with a 4-byte header and a 3-byte record
2478 size:
2479
2480   cat fixedlen | parallel --pipe --header .{4} --block 3 --recend '' \
2481     'echo start; cat; echo'
2482
2483 Output:
2484
2485   start
2486   HHHHAAA
2487   start
2488   HHHHCCC
2489   start
2490   HHHHBBB
2491
2492 It may be more efficient to increase B<--block> to a multiple of the
2493 record size.
2494
2495 =head2 Record separators
2496
2497 GNU B<parallel> uses separators to determine where two records split.
2498
2499 B<--recstart> gives the string that starts a record; B<--recend> gives the
2500 string that ends a record. The default is B<--recend '\n'> (newline).
2501
2502 If both B<--recend> and B<--recstart> are given, then the record will only
2503 split if the recend string is immediately followed by the recstart
2504 string.
2505
2506 Here the B<--recend> is set to B<', '>:
2507
2508   echo /foo, bar/, /baz, qux/, | \
2509     parallel -kN1 --recend ', ' --pipe echo JOB{#}\;cat\;echo END
2510
2511 Output:
2512
2513   JOB1
2514   /foo, END
2515   JOB2
2516   bar/, END
2517   JOB3
2518   /baz, END
2519   JOB4
2520   qux/,
2521   END
2522
2523 Here the B<--recstart> is set to B</>:
2524
2525   echo /foo, bar/, /baz, qux/, | \
2526     parallel -kN1 --recstart / --pipe echo JOB{#}\;cat\;echo END
2527
2528 Output:
2529
2530   JOB1
2531   /foo, barEND
2532   JOB2
2533   /, END
2534   JOB3
2535   /baz, quxEND
2536   JOB4
2537   /,
2538   END
2539
2540 Here both B<--recend> and B<--recstart> are set:
2541
2542   echo /foo, bar/, /baz, qux/, | \
2543     parallel -kN1 --recend ', ' --recstart / --pipe \
2544     echo JOB{#}\;cat\;echo END
2545
2546 Output:
2547
2548   JOB1
2549   /foo, bar/, END
2550   JOB2
2551   /baz, qux/,
2552   END
2553
2554 Note the difference between setting one string and setting both strings.
2555
2556 With B<--regexp> the B<--recend> and B<--recstart> will be treated as
2557 a regular expression:
2558
2559   echo foo,bar,_baz,__qux, | \
2560     parallel -kN1 --regexp --recend ,_+ --pipe \
2561     echo JOB{#}\;cat\;echo END
2562
2563 Output:
2564
2565   JOB1
2566   foo,bar,_END
2567   JOB2
2568   baz,__END
2569   JOB3
2570   qux,
2571   END
2572
2573 GNU B<parallel> can remove the record separators with
2574 B<--remove-rec-sep>/B<--rrs>:
2575
2576   echo foo,bar,_baz,__qux, | \
2577     parallel -kN1 --rrs --regexp --recend ,_+ --pipe \
2578     echo JOB{#}\;cat\;echo END
2579
2580 Output:
2581
2582   JOB1
2583   foo,barEND
2584   JOB2
2585   bazEND
2586   JOB3
2587   qux,
2588   END
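
B<--recstart> is also handy for records that span several lines and begin with a fixed marker. The following sketch uses made-up input in which every record starts with B<E<gt>>; it assumes the marker never occurs anywhere else in the data, because without B<--recend> the input is split at every occurrence of the B<--recstart> string.

```shell
# Three records; each begins with '>' and spans two or three lines.
printf '>rec1\nline a\nline b\n>rec2\nline c\n>rec3\nline d\nline e\n' |
  parallel -k --pipe -N1 --recstart '>' 'sed -n 1p'
```

With B<-N1> each job receives exactly one record, and B<sed -n 1p> prints the first line of what it received, so the three marker lines are printed in input order.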
2589
2590 =head2 Header
2591
2592 If the input data has a header, the header can be repeated for each
2593 job by matching the header with B<--header>. If headers start with
2594 B<%> you can do this:
2595
2596   cat num_%header | \
2597     parallel --header '(%.*\n)*' --pipe -N3 echo JOB{#}\;cat
2598
2599 Output (the order may be different):
2600
2601   JOB1
2602   %head1
2603   %head2
2604   1
2605   2
2606   3
2607   JOB2
2608   %head1
2609   %head2
2610   4
2611   5
2612   6
2613   JOB3
2614   %head1
2615   %head2
2616   7
2617   8
2618   9
2619   JOB4
2620   %head1
2621   %head2
2622   10
2623
2624 If the header is 2 lines, B<--header> 2 will work:
2625
2626   cat num_%header | parallel --header 2 --pipe -N3 echo JOB{#}\;cat
2627
2628 Output: Same as above.
2629
2630 =head2 --pipepart
2631
2632 B<--pipe> is not very efficient. It maxes out at around 500
2633 MB/s. B<--pipepart> can easily deliver 5 GB/s. But there are a few
2634 limitations: the input has to be a normal file (not a pipe) given by
2635 B<-a> or B<::::>, and B<-L>/B<-l>/B<-N> do not work. B<--recend> and
2636 B<--recstart>, however, I<do> work, and records can often be split on
2637 that alone.
2638
2639   parallel --pipepart -a num1000000 --block 3m wc
2640
2641 Output (the order may be different):
2642
2643   444443  444444 3000002
2644   428572  428572 3000004
2645   126985  126984  888890
2646
2647
2648 =head1 Shebang
2649
2650 =head2 Input data and parallel command in the same file
2651
2652 GNU B<parallel> is often called like this:
2653
2654   cat input_file | parallel command
2655
2656 With B<--shebang> the I<input_file> and B<parallel> can be combined into the same script.
2657
2658 UNIX shell scripts start with a shebang line like this:
2659
2660   #!/bin/bash
2661
2662 GNU B<parallel> can do that, too. With B<--shebang> the arguments can be
2663 listed in the file. The B<parallel> command is the first line of the
2664 script:
2665
2666   #!/usr/bin/parallel --shebang -r echo
2667
2668   foo
2669   bar
2670   baz
2671
2672 Output (the order may be different):
2673
2674   foo
2675   bar
2676   baz
2677
2678 =head2 Parallelizing existing scripts
2679
2680 GNU B<parallel> is often called like this:
2681
2682   cat input_file | parallel command
2683   parallel command ::: foo bar
2684
2685 If B<command> is a script, B<parallel> and the script can be combined
2686 into a single file, so that this will run the script in parallel:
2687
2688   cat input_file | command
2689   command foo bar
2690
2691 This B<perl> script B<perl_echo> works like B<echo>:
2692
2693   #!/usr/bin/perl
2694
2695   print "@ARGV\n"
2696
2697 It can be called like this:
2698
2699   parallel perl_echo ::: foo bar
2700
2701 By changing the B<#!>-line it can be run in parallel:
2702
2703   #!/usr/bin/parallel --shebang-wrap /usr/bin/perl
2704
2705   print "@ARGV\n"
2706
2707 Thus this will work:
2708
2709   perl_echo foo bar
2710
2711 Output (the order may be different):
2712
2713   foo
2714   bar
2715
2716 This technique can be used for:
2717
2718 =over 9
2719
2720 =item Perl:
2721
2722   #!/usr/bin/parallel --shebang-wrap /usr/bin/perl
2723
2724   print "Arguments @ARGV\n";
2725
2726
2727 =item Python:
2728
2729   #!/usr/bin/parallel --shebang-wrap /usr/bin/python
2730
2731   import sys
2732   print('Arguments', str(sys.argv))
2733
2734
2735 =item Bash/sh/zsh/Korn shell:
2736
2737   #!/usr/bin/parallel --shebang-wrap /bin/bash
2738
2739   echo Arguments "$@"
2740
2741
2742 =item csh:
2743
2744   #!/usr/bin/parallel --shebang-wrap /bin/csh
2745
2746   echo Arguments "$argv"
2747
2748
2749 =item Tcl:
2750
2751   #!/usr/bin/parallel --shebang-wrap /usr/bin/tclsh
2752
2753   puts "Arguments $argv"
2754
2755
2756 =item R:
2757
2758   #!/usr/bin/parallel --shebang-wrap /usr/bin/Rscript --vanilla --slave
2759
2760   args <- commandArgs(trailingOnly = TRUE)
2761   print(paste("Arguments ",args))
2762
2763
2764 =item GNUplot:
2765
2766   #!/usr/bin/parallel --shebang-wrap ARG={} /usr/bin/gnuplot
2767
2768   print "Arguments ", system('echo $ARG')
2769
2770
2771 =item Ruby:
2772
2773   #!/usr/bin/parallel --shebang-wrap /usr/bin/ruby
2774
2775   print "Arguments "
2776   puts ARGV
2777
2778
2779 =item Octave:
2780
2781   #!/usr/bin/parallel --shebang-wrap /usr/bin/octave
2782
2783   printf ("Arguments");
2784   arg_list = argv ();
2785   for i = 1:nargin
2786     printf (" %s", arg_list{i});
2787   endfor
2788   printf ("\n");
2789
2790 =item Common LISP:
2791
2792   #!/usr/bin/parallel --shebang-wrap /usr/bin/clisp
2793
2794   (format t "~&~S~&" 'Arguments)
2795   (format t "~&~S~&" *args*)
2796
2797 =item PHP:
2798
2799   #!/usr/bin/parallel --shebang-wrap /usr/bin/php
2800   <?php
2801   echo "Arguments";
2802   foreach(array_slice($argv,1) as $v)
2803   {
2804     echo " $v";
2805   }
2806   echo "\n";
2807   ?>
2808
2809 =item Node.js:
2810
2811   #!/usr/bin/parallel --shebang-wrap /usr/bin/node
2812
2813   var myArgs = process.argv.slice(2);
2814   console.log('Arguments ', myArgs);
2815
2816 =item Lua:
2817
2818   #!/usr/bin/parallel --shebang-wrap /usr/bin/lua
2819
2820   io.write "Arguments"
2821   for a = 1, #arg do
2822     io.write(" ")
2823     io.write(arg[a])
2824   end
2825   print("")
2826
2827 =item C#:
2828
2829   #!/usr/bin/parallel --shebang-wrap ARGV={} /usr/bin/csharp
2830
2831   var argv = Environment.GetEnvironmentVariable("ARGV");
2832   print("Arguments "+argv);
2833
2834 =back
2835
2836 =head1 Semaphore
2837
2838 GNU B<parallel> can work as a counting semaphore. This is slower and less
2839 efficient than its normal mode.
2840
2841 A counting semaphore is like a row of toilets. People needing a toilet
2842 can use any toilet, but if there are more people than toilets, they
2843 will have to wait for one of the toilets to become available.
2844
2845 An alias for B<parallel --semaphore> is B<sem>.
2846
2847 B<sem> will follow a person to the toilets, wait until a toilet is
2848 available, leave the person in the toilet and exit.
2849
2850 B<sem --fg> will follow a person to the toilets, wait until a toilet is
2851 available, stay with the person in the toilet and exit when the person
2852 exits.
2853
2854 B<sem --wait> will wait for all persons to leave the toilets.
2855
2856 B<sem> does not have a queue discipline, so the next person is chosen
2857 randomly.
2858
2859 B<-j> sets the number of toilets.
2860
2861 =head2 Mutex
2862
2863 The default is to have only one toilet (this is called a mutex). The
2864 program is started in the background and B<sem> exits immediately. Use
2865 B<--wait> to wait for all B<sem>s to finish:
2866
2867   sem 'sleep 1; echo The first finished' &&
2868     echo The first is now running in the background &&
2869     sem 'sleep 1; echo The second finished' &&
2870     echo The second is now running in the background
2871   sem --wait
2872
2873 Output:
2874
2875   The first is now running in the background
2876   The first finished
2877   The second is now running in the background
2878   The second finished
2879
2880 The command can be run in the foreground with B<--fg>, which will only
2881 exit when the command completes:
2882
2883   sem --fg 'sleep 1; echo The first finished' &&
2884     echo The first finished running in the foreground &&
2885     sem --fg 'sleep 1; echo The second finished' &&
2886     echo The second finished running in the foreground
2887   sem --wait
2888
2889 The difference between this and just running the command is that a
2890 mutex is set, so if other B<sem>s were running in the background only one
2891 would run at a time.
2892
2893 To control which semaphore is used, use
2894 B<--semaphorename>/B<--id>. Run this in one terminal:
2895
2896   sem --id my_id -u 'echo First started; sleep 10; echo First done'
2897
2898 and simultaneously this in another terminal:
2899
2900   sem --id my_id -u 'echo Second started; sleep 10; echo Second done'
2901
2902 Note how the second will only be started when the first has finished.
2903
2904 =head2 Counting semaphore
2905
2906 A mutex is like having a single toilet: When it is in use everyone
2907 else will have to wait. A counting semaphore is like having multiple
2908 toilets: Several people can use the toilets, but when they all are in
2909 use, everyone else will have to wait.
2910
2911 B<sem> can emulate a counting semaphore. Use B<--jobs> to set the
2912 number of toilets like this:
2913
2914   sem --jobs 3 --id my_id -u 'echo Start 1; sleep 5; echo 1 done' &&
2915   sem --jobs 3 --id my_id -u 'echo Start 2; sleep 6; echo 2 done' &&
2916   sem --jobs 3 --id my_id -u 'echo Start 3; sleep 7; echo 3 done' &&
2917   sem --jobs 3 --id my_id -u 'echo Start 4; sleep 8; echo 4 done' &&
2918   sem --wait --id my_id
2919
2920 Output:
2921
2922   Start 1
2923   Start 2
2924   Start 3
2925   1 done
2926   Start 4
2927   2 done
2928   3 done
2929   4 done
2930
2931 =head2 Timeout
2932
2933 With B<--semaphoretimeout> you can force the command to run anyway after a
2934 number of seconds (positive number) or give up after that long (negative number):
2935
2936   sem --id foo -u 'echo Slow started; sleep 5; echo Slow ended' &&
2937   sem --id foo --semaphoretimeout 1 'echo Forced running after 1 sec' &&
2938   sem --id foo --semaphoretimeout -2 'echo Give up after 2 secs'
2939   sem --id foo --wait
2940
2941 Output:
2942
2943   Slow started
2944   parallel: Warning: Semaphore timed out. Stealing the semaphore.
2945   Forced running after 1 sec
2946   parallel: Warning: Semaphore timed out. Exiting.
2947   Slow ended
2948
2949 Note how the 'Give up' was not run.
2950
2951 =head1 Informational
2952
2953 GNU B<parallel> has some options to give short information about the
2954 configuration.
2955
2956 B<--help> will print a summary of the most important options:
2957
2958   parallel --help
2959
2960 Output:
2961
2962   Usage:
2963
2964   parallel [options] [command [arguments]] < list_of_arguments
2965   parallel [options] [command [arguments]] (::: arguments|:::: argfile(s))...
2966   cat ... | parallel --pipe [options] [command [arguments]]
2967
2968   -j n            Run n jobs in parallel
2969   -k              Keep same order
2970   -X              Multiple arguments with context replace
2971   --colsep regexp Split input on regexp for positional replacements
2972   {} {.} {/} {/.} {#} {%} {= perl code =} Replacement strings
2973   {3} {3.} {3/} {3/.} {=3 perl code =}    Positional replacement strings
2974   With --plus:    {} = {+/}/{/} = {.}.{+.} = {+/}/{/.}.{+.} = {..}.{+..} =
2975                   {+/}/{/..}.{+..} = {...}.{+...} = {+/}/{/...}.{+...}
2976
2977   -S sshlogin     Example: foo@server.example.com
2978   --slf ..        Use ~/.parallel/sshloginfile as the list of sshlogins
2979   --trc {}.bar    Shorthand for --transfer --return {}.bar --cleanup
2980   --onall         Run the given command with argument on all sshlogins
2981   --nonall        Run the given command with no arguments on all sshlogins
2982
2983   --pipe          Split stdin (standard input) to multiple jobs.
2984   --recend str    Record end separator for --pipe.
2985   --recstart str  Record start separator for --pipe.
2986
2987   See 'man parallel' for details
2988
2989   Academic tradition requires you to cite works you base your article on.
2990   When using programs that use GNU Parallel to process data for publication
2991   please cite:
2992
2993     O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
2994     ;login: The USENIX Magazine, February 2011:42-47.
2995
2996   This helps funding further development; AND IT WON'T COST YOU A CENT.
2997   If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
2998
2999 When asking for help, always report the full output of this:
3000
3001   parallel --version
3002
3003 Output:
3004
3005   GNU parallel 20230122
3006   Copyright (C) 2007-2024 Ole Tange, http://ole.tange.dk and Free Software
3007   Foundation, Inc.
3008   License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
3009   This is free software: you are free to change and redistribute it.
3010   GNU parallel comes with no warranty.
3011
3012   Web site: https://www.gnu.org/software/parallel
3013
3014   When using programs that use GNU Parallel to process data for publication
3015   please cite as described in 'parallel --citation'.
3016
3017 In scripts B<--minversion> can be used to ensure the user has at least
3018 this version:
3019
3020   parallel --minversion 20130722 && \
3021     echo Your version is at least 20130722.
3022
3023 Output:
3024
3025   20160322
3026   Your version is at least 20130722.
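
To stop a script early when the installed version is too old, the exit code of B<--minversion> can be tested directly. A minimal sketch; the function name B<require_parallel> and the wording of the message are made up for this illustration.

```shell
# require_parallel MINVERSION
# Hypothetical helper: returns non-zero (and says why on stderr) unless
# the installed GNU parallel is at least MINVERSION (format: YYYYMMDD).
require_parallel() {
  if ! parallel --minversion "$1" >/dev/null 2>&1; then
    echo "This script needs GNU parallel $1 or newer." >&2
    return 1
  fi
}

# Typical use at the top of a script:
#   require_parallel 20130722 || exit 1
```

The version number that B<--minversion> prints is discarded here, because only the exit code matters for the check.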
3027
3028 If you are using GNU B<parallel> for research the BibTeX citation can be
3029 generated using B<--citation>:
3030
3031   parallel --citation
3032
3033 Output:
3034
3035   Academic tradition requires you to cite works you base your article on.
3036   When using programs that use GNU Parallel to process data for publication
3037   please cite:
3038
3039   @article{Tange2011a,
3040     title = {GNU Parallel - The Command-Line Power Tool},
3041     author = {O. Tange},
3042     address = {Frederiksberg, Denmark},
3043     journal = {;login: The USENIX Magazine},
3044     month = {Feb},
3045     number = {1},
3046     volume = {36},
3047     url = {https://www.gnu.org/s/parallel},
3048     year = {2011},
3049     pages = {42-47},
3050     doi = {10.5281/zenodo.16303}
3051   }
3052
3053   (Feel free to use \nocite{Tange2011a})
3054
3055   This helps funding further development; AND IT WON'T COST YOU A CENT.
3056   If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
3057
3058   If you send a copy of your published article to tange@gnu.org, it will be
3059   mentioned in the release notes of next version of GNU Parallel.
3060
3061 With B<--max-line-length-allowed> GNU B<parallel> will report the maximal
3062 size of the command line:
3063
3064   parallel --max-line-length-allowed
3065
3066 Output (may vary on different systems):
3067
3068   131071
3069
3070 B<--number-of-cpus> and B<--number-of-cores> run system specific code to
3071 determine the number of CPUs and CPU cores on the system. On
3072 unsupported platforms they will return 1:
3073
3074   parallel --number-of-cpus
3075   parallel --number-of-cores
3076
3077 Output (may vary on different systems):
3078
3079   4
3080   64
3081
3082 =head1 Profiles
3083
3084 The defaults for GNU B<parallel> can be changed systemwide by putting the
3085 command line options in B</etc/parallel/config>. They can be changed for
3086 a user by putting them in B<~/.parallel/config>.
3087
3088 Profiles work the same way, but have to be referred to with B<--profile>:
3089
3090   echo '--nice 17' > ~/.parallel/nicetimeout
3091   echo '--timeout 300%' >> ~/.parallel/nicetimeout
3092   parallel --profile nicetimeout echo ::: A B C
3093
3094 Output:
3095
3096   A
3097   B
3098   C
3099
3100 Profiles can be combined:
3101
3102   echo '-vv --dry-run' > ~/.parallel/dryverbose
3103   parallel --profile dryverbose --profile nicetimeout echo ::: A B C
3104
3105 Output:
3106
3107   echo A
3108   echo B
3109   echo C
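
The config files mentioned at the start of this section take the same kind of lines as a profile. The following sketch tries this out without touching the real B<~/.parallel/config> by pointing B<HOME> at a scratch directory. The variable name B<demo_home> is made up, and the sketch assumes that B<PARALLEL_HOME> is not set.

```shell
# Use a scratch HOME so the real ~/.parallel/config is left untouched.
demo_home=$(mktemp -d)
mkdir -p "$demo_home/.parallel"

# Options go into the file the same way as in the profile example.
echo '--keep-order' > "$demo_home/.parallel/config"

# --keep-order is now a default: the output follows the input order even
# though the job with the shortest sleep finishes first.
HOME=$demo_home parallel 'sleep {}; echo {}' ::: 2 1 0

rm -rf "$demo_home"
```

Without the config file the jobs would normally be printed in the order they finish.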
3110
3111
3112 =head1 Spread the word
3113
3114 I hope you have learned something from this tutorial.
3115
3116 If you like GNU B<parallel>:
3117
3118 =over 2
3119
3120 =item *
3121
3122 (Re-)walk through the tutorial if you have not done so in the past year
3123 (https://www.gnu.org/software/parallel/parallel_tutorial.html)
3124
3125 =item *
3126
3127 Give a demo at your local user group/your team/your colleagues
3128
3129 =item *
3130
3131 Post the intro videos and the tutorial on Reddit, Mastodon, Diaspora*,
3132 forums, blogs, Identi.ca, Google+, Twitter, Facebook, Linkedin, and
3133 mailing lists
3134
3135 =item *
3136
3137 Request or write a review for your favourite blog or magazine
3138 (especially if you do something cool with GNU B<parallel>)
3139
3140 =item *
3141
3142 Invite me for your next conference
3143
3144 =back
3145
3146 If you use GNU B<parallel> for research:
3147
3148 =over 2
3149
3150 =item *
3151
3152 Please cite GNU B<parallel> in your publications (use B<--citation>)
3153
3154 =back
3155
3156 If GNU B<parallel> saves you money:
3157
3158 =over 2
3159
3160 =item *
3161
3162 (Have your company) donate to FSF or become a member
3163 https://my.fsf.org/donate/
3164
3165 =back
3166
3167 (C) 2013-2024 Ole Tange, GFDLv1.3+ (See
3168 LICENSES/GFDL-1.3-or-later.txt)
3169
3170
3171 =cut