src/parallel_alternatives.pod

   1 #!/usr/bin/perl -w
   2
   3 # SPDX-FileCopyrightText: 2021-2023 Ole Tange, http://ole.tange.dk and Free Software Foundation, Inc.
   4 # SPDX-License-Identifier: GFDL-1.3-or-later
   5 # SPDX-License-Identifier: CC-BY-SA-4.0
   6
   7 =encoding utf8
   8
   9 =head1 NAME
  10
  11 parallel_alternatives - Alternatives to GNU B<parallel>
  12
  13
  14 =head1 DIFFERENCES BETWEEN GNU Parallel AND ALTERNATIVES
  15
  16 There are a lot of programs that share functionality with GNU
  17 B<parallel>. Some of these are specialized tools, and while GNU
  18 B<parallel> can emulate many of them, a specialized tool can be better
  19 at a given task. GNU B<parallel> strives to include the best of the
  20 general functionality without sacrificing ease of use.
  21
  22 B<parallel> has existed since 2002-01-06 and as GNU B<parallel> since
  23 2010. A lot of the alternatives have not had the vitality to survive
  24 that long, but have come and gone during that time.
  25
  26 GNU B<parallel> is actively maintained with a new release every month
  27 since 2010. Most other alternatives are fleeting interests of the
  28 developers with irregular releases and only maintained for a few
  29 years.
  30
  31
  32 =head2 SUMMARY LEGEND
  33
  34 The following features are in some of the comparable tools:
  35
  36 =head3 Inputs
  37
  38 =over
  39
  40 =item I1. Arguments can be read from stdin
  41
  42 =item I2. Arguments can be read from a file
  43
  44 =item I3. Arguments can be read from multiple files
  45
  46 =item I4. Arguments can be read from command line
  47
  48 =item I5. Arguments can be read from a table
  49
  50 =item I6. Arguments can be read from the same file using #! (shebang)
  51
  52 =item I7. Line oriented input as default (Quoting of special chars not needed)
  53
  54 =back
  55
  56
  57 =head3 Manipulation of input
  58
  59 =over
  60
  61 =item M1. Composed command
  62
  63 =item M2. Multiple arguments can fill up an execution line
  64
  65 =item M3. Arguments can be put anywhere in the execution line
  66
  67 =item M4. Multiple arguments can be put anywhere in the execution line
  68
  69 =item M5. Arguments can be replaced with context
  70
  71 =item M6. Input can be treated as the complete command line
  72
  73 =back
  74
  75
  76 =head3 Outputs
  77
  78 =over
  79
  80 =item O1. Grouping output so output from different jobs does not mix
  81
  82 =item O2. Send stderr (standard error) to stderr (standard error)
  83
  84 =item O3. Send stdout (standard output) to stdout (standard output)
  85
  86 =item O4. Order of output can be same as order of input
  87
  88 =item O5. Stdout only contains stdout (standard output) from the command
  89
  90 =item O6. Stderr only contains stderr (standard error) from the command
  91
  92 =item O7. Buffering on disk
  93
  94 =item O8. No temporary files left if killed
  95
  96 =item O9. Test if disk runs full during run
  97
  98 =item O10. Output of a line bigger than 4 GB
  99
 100 =back
 101
 102
 103 =head3 Execution
 104
 105 =over
 106
 107 =item E1. Run jobs in parallel
 108
 109 =item E2. List running jobs
 110
 111 =item E3. Finish running jobs, but do not start new jobs
 112
 113 =item E4. Number of running jobs can depend on number of cpus
 114
 115 =item E5. Finish running jobs, but do not start new jobs after first failure
 116
 117 =item E6. Number of running jobs can be adjusted while running
 118
 119 =item E7. Only spawn new jobs if load is less than a limit
 120
 121 =back
 122
 123
 124 =head3 Remote execution
 125
 126 =over
 127
 128 =item R1. Jobs can be run on remote computers
 129
 130 =item R2. Basefiles can be transferred
 131
 132 =item R3. Argument files can be transferred
 133
 134 =item R4. Result files can be transferred
 135
 136 =item R5. Cleanup of transferred files
 137
 138 =item R6. No config files needed
 139
 140 =item R7. Do not run more than SSHD's MaxStartups can handle
 141
 142 =item R8. Configurable SSH command
 143
 144 =item R9. Retry if connection breaks occasionally
 145
 146 =back
 147
 148
 149 =head3 Semaphore
 150
 151 =over
 152
 153 =item S1. Possibility to work as a mutex
 154
 155 =item S2. Possibility to work as a counting semaphore
 156
 157 =back
 158
 159
 160 =head3 Legend
 161
 162 =over
 163
 164 =item - = no
 165
 166 =item x = not applicable
 167
 168 =item ID = yes
 169
 170 =back
 171
 172 As not every new version of the programs is tested, the table may be
 173 outdated. Please file a bug report if you find errors (see REPORTING
 174 BUGS).
 175
 176 parallel:
 177
 178 =over
 179
 180 =item I1 I2 I3 I4 I5 I6 I7
 181
 182 =item M1 M2 M3 M4 M5 M6
 183
 184 =item O1 O2 O3 O4 O5 O6 O7 O8 O9 O10
 185
 186 =item E1 E2 E3 E4 E5 E6 E7
 187
 188 =item R1 R2 R3 R4 R5 R6 R7 R8 R9
 189
 190 =item S1 S2
 191
 192 =back
 193
 194
 195 =head2 DIFFERENCES BETWEEN xargs AND GNU Parallel
 196
 197 Summary (see legend above):
 198
 199 =over
 200
 201 =item I1 I2 - - - - -
 202
 203 =item - M2 M3 - - -
 204
 205 =item - O2 O3 - O5 O6
 206
 207 =item E1 - - - - - -
 208
 209 =item - - - - - x - - -
 210
 211 =item - -
 212
 213 =back
 214
 215 B<xargs> offers some of the same possibilities as GNU B<parallel>.
 216
 217 B<xargs> deals badly with special characters (such as space, \, ' and
 218 "). To see the problem try this:
 219
 220   touch important_file
 221   touch 'not important_file'
 222   ls not* | xargs rm
 223   mkdir -p "My brother's 12\" records"
 224   ls | xargs rmdir
 225   touch 'c:\windows\system32\clfs.sys'
 226   echo 'c:\windows\system32\clfs.sys' | xargs ls -l
 227
 228 You can specify B<-0>, but many input generators are not optimized for
 229 using B<NUL> as separator but are optimized for B<newline> as
 230 separator. E.g. B<awk>, B<ls>, B<echo>, B<tar -v>, B<head> (requires
 231 using B<-z>), B<tail> (requires using B<-z>), B<sed> (requires using
 232 B<-z>), B<perl> (B<-0> and \0 instead of \n), B<locate> (requires
 233 using B<-0>), B<find> (requires using B<-print0>), B<grep> (requires
 234 using B<-z> or B<-Z>), B<sort> (requires using B<-z>).
 235
 236 GNU B<parallel>'s newline separation can be emulated with:
 237
 238   cat | xargs -d "\n" -n1 command
 239
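The difference is easy to check without creating any files. The sketch
below counts how many arguments the command receives for a single input
line that contains a space. It assumes GNU B<xargs> (B<-d> is a GNU
extension) and uses a made-up file name.

```shell
# One input line that contains a space (a made-up file name).
name='not important_file'

# Default xargs splits on whitespace: the command sees 2 arguments.
default_split=$(printf '%s\n' "$name" | xargs sh -c 'echo $#' sh)

# With -d '\n' only newlines separate arguments: 1 argument.
newline_split=$(printf '%s\n' "$name" | xargs -d '\n' sh -c 'echo $#' sh)

echo "default: $default_split, newline-delimited: $newline_split"
```
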
 240 B<xargs> can run a given number of jobs in parallel, but has no
 241 support for running number-of-cpu-cores jobs in parallel.
 242
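A common workaround is to compute the job count outside B<xargs>. A
minimal sketch, assuming B<nproc> from GNU coreutils is available (other
systems may need a different way to query the core count):

```shell
# Ask the system for the number of available processing units.
cores=$(nproc)

# Run one echo per input line, with up to $cores jobs at a time.
# The output order is not guaranteed, so sort it before using it.
result=$(seq 4 | xargs -P "$cores" -n1 echo | sort -n | tr '\n' ' ')
echo "$result"
```
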
 243 B<xargs> has no support for grouping the output, therefore output may
 244 run together, e.g. the first half of a line is from one process and
 245 the last half of the line is from another process. The example
 246 B<Parallel grep> cannot be done reliably with B<xargs> because of
 247 this. To see this in action try:
 248
 249   parallel perl -e "'"'$a="1"."{}"x10000000;print $a,"\n"'"'" \
 250     '>' {} ::: a b c d e f g h
 251   # Serial = no mixing = the wanted result
 252   # 'tr -s a-z' squeezes repeating letters into a single letter
 253   echo a b c d e f g h | xargs -P1 -n1 grep 1 | tr -s a-z
 254   # Compare to 8 jobs in parallel
 255   parallel -kP8 -n1 grep 1 ::: a b c d e f g h | tr -s a-z
 256   echo a b c d e f g h | xargs -P8 -n1 grep 1 | tr -s a-z
 257   echo a b c d e f g h | xargs -P8 -n1 grep --line-buffered 1 | \
 258     tr -s a-z
 259
 260 Or try this:
 261
 262   slow_seq() {
 263     echo Count to "$@"
 264     seq "$@" |
 265       perl -ne '$|=1; for(split//){ print; select($a,$a,$a,0.100);}'
 266   }
 267   export -f slow_seq
 268   # Serial = no mixing = the wanted result
 269   seq 8 | xargs -n1 -P1 -I {} bash -c 'slow_seq {}'
 270   # Compare to 8 jobs in parallel
 271   seq 8 | parallel -P8 slow_seq {}
 272   seq 8 | xargs -n1 -P8 -I {} bash -c 'slow_seq {}'
 273
 274 B<xargs> has no support for keeping the order of the output, therefore
 275 if running jobs in parallel using B<xargs> the output of the second
 276 job cannot be postponed till the first job is done.
 277
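With B<xargs> the ordering has to be rebuilt by hand, for example by
letting every job write to its own file and concatenating the files in
input order afterwards. The following is only a sketch of that idea: the
temporary directory and the C<job N> output are made up for illustration,
and it assumes a B<sleep> that accepts fractional seconds.

```shell
tmpdir=$(mktemp -d)

# Jobs for lower numbers sleep longer, so they finish in reverse order.
# Each job writes to its own file ($1 is the temporary directory).
seq 3 | xargs -P3 -I{} sh -c \
  'sleep 0.$((3 - {})); echo "job {}" > "$1/{}.out"' sh "$tmpdir"

# Concatenate the per-job files in input order.
ordered=$(for i in 1 2 3; do cat "$tmpdir/$i.out"; done)
rm -rf "$tmpdir"
echo "$ordered"
```

This also leaves the cleanup of the temporary files to the caller.
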
 278 B<xargs> has no support for running jobs on remote computers.
 279
 280 B<xargs> has no support for context replace, so you will have to create the
 281 arguments.
 282
 283 If you use a replace string in B<xargs> (B<-I>) you cannot force
 284 B<xargs> to use more than one argument.
 285
 286 Quoting in B<xargs> works like B<-q> in GNU B<parallel>. This means
 287 composed commands and redirection require using B<bash -c>.
 288
 289   ls | parallel "wc {} >{}.wc"
 290   ls | parallel "echo {}; ls {}|wc"
 291
 292 becomes (assuming you have 8 cores and that none of the filenames
 293 contain space, " or ').
 294
 295   ls | xargs -d "\n" -P8 -I {} bash -c "wc {} >{}.wc"
 296   ls | xargs -d "\n" -P8 -I {} bash -c "echo {}; ls {}|wc"
 297
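The restriction on filenames comes from pasting {} into the command
string, which the inner shell then parses again. A common way around it
is to pass the argument as a positional parameter instead. This sketch
assumes GNU B<xargs> and works in a throwaway directory with a made-up
filename:

```shell
dir=$(mktemp -d)
printf 'abc' > "$dir/My brother's 12\" record"

# {} is handed to the inner shell as $1 instead of being pasted into
# the command string, so quotes and spaces in the name are preserved.
(cd "$dir" && ls | xargs -d '\n' -I{} bash -c 'wc -c < "$1" > "$1.wc"' _ {})

# The .wc file holds the byte count of the 3-byte input file.
count=$(tr -d ' ' < "$dir/My brother's 12\" record.wc")
rm -rf "$dir"
echo "$count"
```
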
 298 A more extreme example can be found on:
 299 https://unix.stackexchange.com/q/405552/
 300
 301 https://www.gnu.org/software/findutils/
 302
 303
 304 =head2 DIFFERENCES BETWEEN find -exec AND GNU Parallel
 305
 306 Summary (see legend above):
 307
 308 =over
 309
 310 =item -  -  -  x  -  x  -
 311
 312 =item -  M2 M3 -  -  -  -
 313
 314 =item -  O2 O3 O4 O5 O6
 315
 316 =item -  -  -  -  -  -  -
 317
 318 =item -  -  -  -  -  -  -  -  -
 319
 320 =item x  x
 321
 322 =back
 323
 324 B<find -exec> offers some of the same possibilities as GNU B<parallel>.
 325
 326 B<find -exec> only works on files. Processing other input (such as
 327 hosts or URLs) will require creating these inputs as files. B<find
 328 -exec> has no support for running commands in parallel.
 329
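When parallel execution is needed, B<find> is typically used only to
generate the file list, which is then piped to a parallel runner. A
minimal sketch with B<xargs> (assuming an implementation that supports
B<-0> and B<-P>, such as GNU B<xargs>) on a temporary directory with
made-up filenames; GNU B<parallel -0> reads the same NUL-separated input:

```shell
dir=$(mktemp -d)
touch "$dir/a b.txt" "$dir/c.txt"

# find emits NUL-terminated names; xargs runs up to 2 jobs at a time.
names=$(find "$dir" -type f -print0 | xargs -0 -P2 -n1 basename | sort)
rm -rf "$dir"
echo "$names"
```
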
 330 https://www.gnu.org/software/findutils/
 331 (Last checked: 2019-01)
 332
 333
 334 =head2 DIFFERENCES BETWEEN make -j AND GNU Parallel
 335
 336 Summary (see legend above):
 337
 338 =over
 339
 340 =item -  -  -  -  -  -  -
 341
 342 =item -  -  -  -  -  -
 343
 344 =item O1 O2 O3 -  x  O6
 345
 346 =item E1 -  -  -  E5 -
 347
 348 =item -  -  -  -  -  -  -  -  -
 349
 350 =item -  -
 351
 352 =back
 353
 354 B<make -j> can run jobs in parallel, but requires a crafted Makefile
 355 to do this. That results in extra quoting to get filenames containing
 356 newlines to work correctly.
 357
 358 B<make -j> computes a dependency graph before running jobs. Jobs run
 359 by GNU B<parallel> do not depend on each other.
 360
 361 (Very early versions of GNU B<parallel> were coincidentally implemented
 362 using B<make -j>).
 363
 364 https://www.gnu.org/software/make/
 365 (Last checked: 2019-01)
 366
 367
 368 =head2 DIFFERENCES BETWEEN ppss AND GNU Parallel
 369
 370 Summary (see legend above):
 371
 372 =over
 373
 374 =item I1 I2 - - - - I7
 375
 376 =item M1 - M3 - - M6
 377
 378 =item O1 - - x - -
 379
 380 =item E1 E2 ?E3 E4 - - -
 381
 382 =item R1 R2 R3 R4 - - ?R7 ? ?
 383
 384 =item - -
 385
 386 =back
 387
 388 B<ppss> is also a tool for running jobs in parallel.
 389
 390 The output of B<ppss> is status information and thus not useful for
 391 using as input for another command. The output from the jobs is put
 392 into files.
 393
 394 The argument replace string ($ITEM) cannot be changed. Arguments must
 395 be quoted - thus arguments containing special characters (space '"&!*)
 396 may cause problems. More than one argument is not supported. Filenames
 397 containing newlines are not processed correctly. When reading input
 398 from a file null cannot be used as a terminator. B<ppss> needs to read
 399 the whole input file before starting any jobs.
 400
 401 Output and status information is stored in ppss_dir and thus requires
 402 cleanup when completed. If the dir is not removed before running
 403 B<ppss> again it may cause nothing to happen as B<ppss> thinks the
 404 task is already done. GNU B<parallel> will normally not need cleaning
 405 up if running locally and will only need cleaning up if stopped
 406 abnormally and running remote (B<--cleanup> may not complete if
 407 stopped abnormally). The example B<Parallel grep> would require extra
 408 postprocessing if written using B<ppss>.
 409
 410 For remote systems PPSS requires 3 steps: config, deploy, and
 411 start. GNU B<parallel> only requires one step.
 412
 413 =head3 EXAMPLES FROM ppss MANUAL
 414
 415 Here are the examples from B<ppss>'s manual page with the equivalent
 416 using GNU B<parallel>:
 417
 418   1$ ./ppss.sh standalone -d /path/to/files -c 'gzip '
 419
 420   1$ find /path/to/files -type f | parallel gzip
 421
 422   2$ ./ppss.sh standalone -d /path/to/files \
 423        -c 'cp "$ITEM" /destination/dir '
 424
 425   2$ find /path/to/files -type f | parallel cp {} /destination/dir
 426
 427   3$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q '
 428
 429   3$ parallel -a list-of-urls.txt wget -q
 430
 431   4$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q "$ITEM"'
 432
 433   4$ parallel -a list-of-urls.txt wget -q {}
 434
 435   5$ ./ppss config -C config.cfg -c 'encode.sh ' -d /source/dir \
 436        -m 192.168.1.100 -u ppss -k ppss-key.key -S ./encode.sh \
 437        -n nodes.txt -o /some/output/dir --upload --download;
 438      ./ppss deploy -C config.cfg
 439      ./ppss start -C config
 440
 441   5$ # parallel does not use configs. If you want
 442      # a different username put it in nodes.txt: user@hostname
 443      find source/dir -type f |
 444        parallel --sshloginfile nodes.txt --trc {.}.mp3 \
 445          lame -a {} -o {.}.mp3 --preset standard --quiet
 446
 447   6$ ./ppss stop -C config.cfg
 448
 449   6$ killall -TERM parallel
 450
 451   7$ ./ppss pause -C config.cfg
 452
 453   7$ Press: CTRL-Z or killall -SIGTSTP parallel
 454
 455   8$ ./ppss continue -C config.cfg
 456
 457   8$ Enter: fg or killall -SIGCONT parallel
 458
 459   9$ ./ppss.sh status -C config.cfg
 460
 461   9$ killall -SIGUSR2 parallel
 462
 463 https://github.com/louwrentius/PPSS
 464 (Last checked: 2010-12)
 465
 466
 467 =head2 DIFFERENCES BETWEEN pexec AND GNU Parallel
 468
 469 Summary (see legend above):
 470
 471 =over
 472
 473 =item I1 I2 - I4 I5 - -
 474
 475 =item M1 - M3 - - M6
 476
 477 =item O1 O2 O3 - O5 O6
 478
 479 =item E1 - - E4 - E6 -
 480
 481 =item R1 - - - - R6 - - -
 482
 483 =item S1 -
 484
 485 =back
 486
 487 B<pexec> is also a tool for running jobs in parallel.
 488
 489 =head3 EXAMPLES FROM pexec MANUAL
 490
 491 Here are the examples from B<pexec>'s info page with the equivalent
 492 using GNU B<parallel>:
 493
 494   1$ pexec -o sqrt-%s.dat -p "$(seq 10)" -e NUM -n 4 -c -- \
 495        'echo "scale=10000;sqrt($NUM)" | bc'
 496
 497   1$ seq 10 | parallel -j4 'echo "scale=10000;sqrt({})" | \
 498        bc > sqrt-{}.dat'
 499
 500   2$ pexec -p "$(ls myfiles*.ext)" -i %s -o %s.sort -- sort
 501
 502   2$ ls myfiles*.ext | parallel sort {} ">{}.sort"
 503
 504   3$ pexec -f image.list -n auto -e B -u star.log -c -- \
 505        'fistar $B.fits -f 100 -F id,x,y,flux -o $B.star'
 506
 507   3$ parallel -a image.list \
 508        'fistar {}.fits -f 100 -F id,x,y,flux -o {}.star' 2>star.log
 509
 510   4$ pexec -r *.png -e IMG -c -o - -- \
 511        'convert $IMG ${IMG%.png}.jpeg ; "echo $IMG: done"'
 512
 513   4$ ls *.png | parallel 'convert {} {.}.jpeg; echo {}: done'
 514
 515   5$ pexec -r *.png -i %s -o %s.jpg -c 'pngtopnm | pnmtojpeg'
 516
 517   5$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {}.jpg'
 518
 519   6$ for p in *.png ; do echo ${p%.png} ; done | \
 520        pexec -f - -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'
 521
 522   6$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'
 523
 524   7$ LIST=$(for p in *.png ; do echo ${p%.png} ; done)
 525      pexec -r $LIST -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'
 526
 527   7$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'
 528
 529   8$ pexec -n 8 -r *.jpg -y unix -e IMG -c \
 530        'pexec -j -m blockread -d $IMG | \
 531         jpegtopnm | pnmscale 0.5 | pnmtojpeg | \
 532         pexec -j -m blockwrite -s th_$IMG'
 533
 534   8$ # Combining GNU parallel and GNU sem.
 535      ls *jpg | parallel -j8 'sem --id blockread cat {} | jpegtopnm |' \
 536        'pnmscale 0.5 | pnmtojpeg | sem --id blockwrite cat > th_{}'
 537
 538      # If reading and writing is done to the same disk, this may be
 539      # faster as only one process will be either reading or writing:
 540      ls *jpg | parallel -j8 'sem --id diskio cat {} | jpegtopnm |' \
 541        'pnmscale 0.5 | pnmtojpeg | sem --id diskio cat > th_{}'
 542
 543 https://www.gnu.org/software/pexec/
 544 (Last checked: 2010-12)
 545
 546
 547 =head2 DIFFERENCES BETWEEN xjobs AND GNU Parallel
 548
 549 B<xjobs> is also a tool for running jobs in parallel. It only supports
 550 running jobs on your local computer.
 551
 552 B<xjobs> deals badly with special characters just like B<xargs>. See
 553 the section B<DIFFERENCES BETWEEN xargs AND GNU Parallel>.
 554
 555 =head3 EXAMPLES FROM xjobs MANUAL
 556
 557 Here are the examples from B<xjobs>'s man page with the equivalent
 558 using GNU B<parallel>:
 559
 560   1$ ls -1 *.zip | xjobs unzip
 561
 562   1$ ls *.zip | parallel unzip
 563
 564   2$ ls -1 *.zip | xjobs -n unzip
 565
 566   2$ ls *.zip | parallel unzip >/dev/null
 567
 568   3$ find . -name '*.bak' | xjobs gzip
 569
 570   3$ find . -name '*.bak' | parallel gzip
 571
 572   4$ ls -1 *.jar | sed 's/\(.*\)/\1 > \1.idx/' | xjobs jar tf
 573
 574   4$ ls *.jar | parallel jar tf {} '>' {}.idx
 575
 576   5$ xjobs -s script
 577
 578   5$ cat script | parallel
 579
 580   6$ mkfifo /var/run/my_named_pipe;
 581      xjobs -s /var/run/my_named_pipe &
 582      echo unzip 1.zip >> /var/run/my_named_pipe;
 583      echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe
 584
 585   6$ mkfifo /var/run/my_named_pipe;
 586      cat /var/run/my_named_pipe | parallel &
 587      echo unzip 1.zip >> /var/run/my_named_pipe;
 588      echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe
 589
 590 https://www.maier-komor.de/xjobs.html
 591 (Last checked: 2019-01)
 592
 593
 594 =head2 DIFFERENCES BETWEEN prll AND GNU Parallel
 595
 596 B<prll> is also a tool for running jobs in parallel. It does not
 597 support running jobs on remote computers.
 598
 599 B<prll> encourages using BASH aliases and BASH functions instead of
 600 scripts. GNU B<parallel> supports scripts directly, functions if they
 601 are exported using B<export -f>, and aliases if using B<env_parallel>.
 602
 603 B<prll> generates a lot of status information on stderr (standard
 604 error) which makes it harder to use the stderr (standard error) output
 605 of the job directly as input for another program.
 606
 607 =head3 EXAMPLES FROM prll's MANUAL
 608
 609 Here is the example from B<prll>'s man page with the equivalent
 610 using GNU B<parallel>:
 611
 612   1$ prll -s 'mogrify -flip $1' *.jpg
 613
 614   1$ parallel mogrify -flip ::: *.jpg
 615
 616 https://github.com/exzombie/prll
 617 (Last checked: 2019-01)
 618
 619
 620 =head2 DIFFERENCES BETWEEN dxargs AND GNU Parallel
 621
 622 B<dxargs> is also a tool for running jobs in parallel.
 623
 624 B<dxargs> does not deal well with more simultaneous jobs than SSHD's
 625 MaxStartups. B<dxargs> is built only for running jobs remotely, but
 626 does not support transferring files.
 627
 628 https://web.archive.org/web/20120518070250/http://www.semicomplete.com/blog/geekery/distributed-xargs.html
 630 (Last checked: 2019-01)
 631
 632
 633 =head2 DIFFERENCES BETWEEN mdm/middleman AND GNU Parallel
 634
 635 B<middleman> (mdm) is also a tool for running jobs in parallel.
 636
 637 =head3 EXAMPLES FROM middleman's WEBSITE
 638
 639 Here are the shell scripts of
 640 https://web.archive.org/web/20110728064735/http://mdm.berlios.de/usage.html
 641 ported to GNU B<parallel>:
 642
 643   1$ seq 19 | parallel buffon -o - | sort -n > result
 644      cat files | parallel cmd
 645      find dir -execdir sem cmd {} \;
 646
 647 https://github.com/cklin/mdm
 648 (Last checked: 2019-01)
 649
 650
 651 =head2 DIFFERENCES BETWEEN xapply AND GNU Parallel
 652
 653 B<xapply> can run jobs in parallel on the local computer.
 654
 655 =head3 EXAMPLES FROM xapply's MANUAL
 656
 657 Here are the examples from B<xapply>'s man page with the equivalent
 658 using GNU B<parallel>:
 659
 660   1$ xapply '(cd %1 && make all)' */
 661
 662   1$ parallel 'cd {} && make all' ::: */
 663
 664   2$ xapply -f 'diff %1 ../version5/%1' manifest | more
 665
 666   2$ parallel diff {} ../version5/{} < manifest | more
 667
 668   3$ xapply -p/dev/null -f 'diff %1 %2' manifest1 checklist1
 669
 670   3$ parallel --link diff {1} {2} :::: manifest1 checklist1
 671
 672   4$ xapply 'indent' *.c
 673
 674   4$ parallel indent ::: *.c
 675
 676   5$ find ~ksb/bin -type f ! -perm -111 -print | \
 677        xapply -f -v 'chmod a+x' -
 678
 679   5$ find ~ksb/bin -type f ! -perm -111 -print | \
 680        parallel -v chmod a+x
 681
 682   6$ find */ -... | fmt 960 1024 | xapply -f -i /dev/tty 'vi' -
 683
 684   6$ sh <(find */ -... | parallel -s 1024 echo vi)
 685
 686   6$ find */ -... | parallel -s 1024 -Xuj1 vi
 687
 688   7$ find ... | xapply -f -5 -i /dev/tty 'vi' - - - - -
 689
 690   7$ sh <(find ... | parallel -n5 echo vi)
 691
 692   7$ find ... | parallel -n5 -uj1 vi
 693
 694   8$ xapply -fn "" /etc/passwd
 695
 696   8$ parallel -k echo < /etc/passwd
 697
 698   9$ tr ':' '\012' < /etc/passwd | \
 699        xapply -7 -nf 'chown %1 %6' - - - - - - -
 700
 701   9$ tr ':' '\012' < /etc/passwd | parallel -N7 chown {1} {6}
 702
 703   10$ xapply '[ -d %1/RCS ] || echo %1' */
 704
 705   10$ parallel '[ -d {}/RCS ] || echo {}' ::: */
 706
 707   11$ xapply -f '[ -f %1 ] && echo %1' List | ...
 708
 709   11$ parallel '[ -f {} ] && echo {}' < List | ...
 710
 711 https://www.databits.net/~ksb/msrc/local/bin/xapply/xapply.html (Last
 712 checked: 2010-12)
 713
 714
 715 =head2 DIFFERENCES BETWEEN AIX apply AND GNU Parallel
 716
 717 B<apply> can build command lines based on a template and arguments -
 718 very much like GNU B<parallel>. B<apply> does not run jobs in
 719 parallel. B<apply> does not use an argument separator (like B<:::>);
 720 instead the template must be the first argument.
 721
 722 =head3 EXAMPLES FROM IBM's KNOWLEDGE CENTER
 723
 724 Here are the examples from IBM's Knowledge Center and the
 725 corresponding command using GNU B<parallel>:
 726
 727 =head4 To obtain results similar to those of the B<ls> command, enter:
 728
 729   1$ apply echo *
 730   1$ parallel echo ::: *
 731
 732 =head4 To compare the file named a1 to the file named b1, and
 733 the file named a2 to the file named b2, enter:
 734
 735   2$ apply -2 cmp a1 b1 a2 b2
 736   2$ parallel -N2 cmp ::: a1 b1 a2 b2
 737
 738 =head4 To run the B<who> command five times, enter:
 739
 740   3$ apply -0 who 1 2 3 4 5
 741   3$ parallel -N0 who ::: 1 2 3 4 5
 742
 743 =head4 To link all files in the current directory to the directory
 744 /usr/joe, enter:
 745
 746   4$ apply 'ln %1 /usr/joe' *
 747   4$ parallel ln {} /usr/joe ::: *
 748
 749 https://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.cmds1/apply.htm
 751 (Last checked: 2019-01)
 752
 753
 754 =head2 DIFFERENCES BETWEEN paexec AND GNU Parallel
 755
 756 B<paexec> can run jobs in parallel on both the local and remote computers.
 757
 758 B<paexec> requires commands to print a blank line as the last
 759 output. This means you will have to write a wrapper for most programs.
 760
 761 B<paexec> has a job dependency facility so a job can depend on another
 762 job to be executed successfully. Sort of a poor-man's B<make>.
 763
 764 =head3 EXAMPLES FROM paexec's EXAMPLE CATALOG
 765
 766 Here are the examples from B<paexec>'s example catalog with the equivalent
 767 using GNU B<parallel>:
 768
 769 =head4 1_div_X_run
 770
 771   1$ ../../paexec -s -l -c "`pwd`/1_div_X_cmd" -n +1 <<EOF [...]
 772
 773   1$ parallel echo {} '|' `pwd`/1_div_X_cmd <<EOF [...]
 774
 775 =head4 all_substr_run
 776
 777   2$ ../../paexec -lp -c "`pwd`/all_substr_cmd" -n +3 <<EOF [...]
 778
 779   2$ parallel echo {} '|' `pwd`/all_substr_cmd <<EOF [...]
 780
 781 =head4 cc_wrapper_run
 782
 783   3$ ../../paexec -c "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
 784              -n 'host1 host2' \
 785              -t '/usr/bin/ssh -x' <<EOF [...]
 786
 787   3$ parallel echo {} '|' "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
 788              -S host1,host2 <<EOF [...]
 789
 790      # This is not exactly the same, but avoids the wrapper
 791      parallel gcc -O2 -c -o {.}.o {} \
 792              -S host1,host2 <<EOF [...]
 793
 794 =head4 toupper_run
 795
 796   4$ ../../paexec -lp -c "`pwd`/toupper_cmd" -n +10 <<EOF [...]
 797
 798   4$ parallel echo {} '|' ./toupper_cmd <<EOF [...]
 799
 800      # Without the wrapper:
 801      parallel echo {} '| awk {print\ toupper\(\$0\)}' <<EOF [...]
 802
 803 https://github.com/cheusov/paexec
 804 (Last checked: 2010-12)
 805
 806
 807 =head2 DIFFERENCES BETWEEN map(sitaramc) AND GNU Parallel
 808
 809 Summary (see legend above):
 810
 811 =over
 812
 813 =item I1 - - I4 - - (I7)
 814
 815 =item M1 (M2) M3 (M4) M5 M6
 816
 817 =item - O2 O3 - O5 - - x x O10
 818
 819 =item E1 - - - - - -
 820
 821 =item - - - - - - - - -
 822
 823 =item - -
 824
 825 =back
 826
 827 (I7): Only under special circumstances. See below.
 828
 829 (M2+M4): Only if there is a single replacement string.
 830
 831 B<map> rejects input with special characters:
 832
 833   echo "The Cure" > My\ brother\'s\ 12\"\ records
 834
 835   ls | map 'echo %; wc %'
 836
 837 It works with GNU B<parallel>:
 838
 839   ls | parallel 'echo {}; wc {}'
 840
 841 Under some circumstances it also works with B<map>:
 842
 843   ls | map 'echo % works %'
 844
 845 But tiny changes make it reject the input with special characters:
 846
 847   ls | map 'echo % does not work "%"'
 848
 849 This means that many UTF-8 characters will be rejected. This is by
 850 design. From the web page: "As such, programs that I<quietly handle
 851 them, with no warnings at all,> are doing their users a disservice."
 852
 853 B<map> delays each job by 0.01 s. This can be emulated by using
 854 B<parallel --delay 0.01>.
 855
 856 B<map> prints '+' on stderr when a job starts, and '-' when a job
 857 finishes. This cannot be disabled. B<parallel> has B<--bar> if you
 858 need to see progress.
 859
 860 B<map>'s replacement strings (% %D %B %E) can be simulated in GNU
 861 B<parallel> by putting this in B<~/.parallel/config>:
 862
 863   --rpl '%'
 864   --rpl '%D $_=Q(::dirname($_));'
 865   --rpl '%B s:.*/::;s:\.[^/.]+$::;'
 866   --rpl '%E s:.*\.::'
 867
 868 B<map> does not have an argument separator on the command line, but
 869 uses the first argument as command. This makes quoting harder, which in turn
 870 may affect readability. Compare:
 871
 872   map -p 2 'perl -ne '"'"'/^\S+\s+\S+$/ and print $ARGV,"\n"'"'" *
 873
 874   parallel -q perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' ::: *
 875
 876 B<map> can do multiple arguments with context replace, but not without
 877 context replace:
 878
 879   parallel --xargs echo 'BEGIN{'{}'}END' ::: 1 2 3
 880
 881   map "echo 'BEGIN{'%'}END'" 1 2 3
 882
 883 B<map> has no support for grouping. So this gives the wrong results:
 884
 885   parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
 886     ::: a b c d e f
 887   ls -l a b c d e f
 888   parallel -kP4 -n1 grep 1 ::: a b c d e f > out.par
 889   map -n1 -p 4 'grep 1' a b c d e f > out.map-unbuf
 890   map -n1 -p 4 'grep --line-buffered 1' a b c d e f > out.map-linebuf
 891   map -n1 -p 1 'grep --line-buffered 1' a b c d e f > out.map-serial
 892   ls -l out*
 893   md5sum out*
 894
 895 =head3 EXAMPLES FROM map's WEBSITE
 896
 897 Here are the examples from B<map>'s web page with the equivalent using
 898 GNU B<parallel>:
 899
 900   1$ ls *.gif | map convert % %B.png         # default max-args: 1
 901
 902   1$ ls *.gif | parallel convert {} {.}.png
 903
 904   2$ map "mkdir %B; tar -C %B -xf %" *.tgz   # default max-args: 1
 905
 906   2$ parallel 'mkdir {.}; tar -C {.} -xf {}' :::  *.tgz
 907
 908   3$ ls *.gif | map cp % /tmp                # default max-args: 100
 909
 910   3$ ls *.gif | parallel -X cp {} /tmp
 911
 912   4$ ls *.tar | map -n 1 tar -xf %
 913
 914   4$ ls *.tar | parallel tar -xf
 915
 916   5$ map "cp % /tmp" *.tgz
 917
 918   5$ parallel cp {} /tmp ::: *.tgz
 919
 920   6$ map "du -sm /home/%/mail" alice bob carol
 921
 922   6$ parallel "du -sm /home/{}/mail" ::: alice bob carol
 923   or if you prefer running a single job with multiple args:
 924   6$ parallel -Xj1 "du -sm /home/{}/mail" ::: alice bob carol
 925
 926   7$ cat /etc/passwd | map -d: 'echo user %1 has shell %7'
 927
 928   7$ cat /etc/passwd | parallel --colsep : 'echo user {1} has shell {7}'
 929
 930   8$ export MAP_MAX_PROCS=$(( `nproc` / 2 ))
 931
 932   8$ export PARALLEL=-j50%
 933
 934 https://github.com/sitaramc/map
 935 (Last checked: 2020-05)
 936
 937
 938 =head2 DIFFERENCES BETWEEN ladon AND GNU Parallel
 939
 940 B<ladon> can run multiple jobs on files in parallel.
 941
 942 B<ladon> only works on files and the only way to specify files is
 943 using a quoted glob string (such as \*.jpg). It is not possible to
 944 list the files manually.
 945
 946 As replacement strings it uses FULLPATH DIRNAME BASENAME EXT RELDIR
 947 RELPATH.
 948
 949 These can be simulated using GNU B<parallel> by putting this in
 950 B<~/.parallel/config>:
 951
 952   --rpl 'FULLPATH $_=Q($_);chomp($_=qx{readlink -f $_});'
 953   --rpl 'DIRNAME $_=Q(::dirname($_));chomp($_=qx{readlink -f $_});'
 954   --rpl 'BASENAME s:.*/::;s:\.[^/.]+$::;'
 955   --rpl 'EXT s:.*\.::'
 956   --rpl 'RELDIR $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
 957          s:\Q$c/\E::;$_=::dirname($_);'
 958   --rpl 'RELPATH $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
 959          s:\Q$c/\E::;'
 960
 961 B<ladon> deals badly with filenames containing " and newline, and it
 962 fails for output larger than 200k:
 963
 964   ladon '*' -- seq 36000 | wc
 965
 966 =head3 EXAMPLES FROM ladon MANUAL
 967
 968 It is assumed that the '--rpl's above are put in B<~/.parallel/config>
 969 and that it is run under a shell that supports '**' globbing (such as B<zsh>):
 970
 971   1$ ladon "**/*.txt" -- echo RELPATH
 972
 973   1$ parallel echo RELPATH ::: **/*.txt
 974
 975   2$ ladon "~/Documents/**/*.pdf" -- shasum FULLPATH >hashes.txt
 976
 977   2$ parallel shasum FULLPATH ::: ~/Documents/**/*.pdf >hashes.txt
 978
 979   3$ ladon -m thumbs/RELDIR "**/*.jpg" -- convert FULLPATH \
 980        -thumbnail 100x100^ -gravity center -extent 100x100 \
 981        thumbs/RELPATH
 982
 983   3$ parallel mkdir -p thumbs/RELDIR\; convert FULLPATH \
 984        -thumbnail 100x100^ -gravity center -extent 100x100 \
 985        thumbs/RELPATH ::: **/*.jpg
 986
 987   4$ ladon "~/Music/*.wav" -- lame -V 2 FULLPATH DIRNAME/BASENAME.mp3
 988
 989   4$ parallel lame -V 2 FULLPATH DIRNAME/BASENAME.mp3 ::: ~/Music/*.wav
 990
 991 https://github.com/danielgtaylor/ladon
 992 (Last checked: 2019-01)
 993
 994
 995 =head2 DIFFERENCES BETWEEN jobflow AND GNU Parallel
 996
 997 Summary (see legend above):
 998
 999 =over
1000
1001 =item I1 - - - - - I7
1002
1003 =item - - M3 - - (M6)
1004
1005 =item O1 O2 O3 - O5 O6 (O7) - - O10
1006
1007 =item E1 - - - - E6 -
1008
1009 =item - - - - - - - - -
1010
1011 =item - -
1012
1013 =back
1014
1015
1016 B<jobflow> can run multiple jobs in parallel.
1017
1018 Just like with B<xargs>, output from B<jobflow> jobs running in parallel
1019 is mixed together by default. B<jobflow> can buffer into files with
1020 B<-buffered> (placed in /run/shm), but these are not cleaned up if
1021 B<jobflow> dies unexpectedly (e.g. by Ctrl-C). If the total output is
1022 big (in the order of RAM+swap) it can cause the system to slow to a
1023 crawl and eventually run out of memory.
1024
1025 Just like with B<xargs>, redirection and composed commands require wrapping
1026 with B<bash -c>.
1027
1028 Input lines can be at most 4096 bytes.
1029
1030 B<jobflow> is faster than GNU B<parallel> but around 6 times slower
1031 than B<parallel-bash>.
1032
1033 B<jobflow> has no equivalent for B<--pipe> or B<--sshlogin>.
1034
1035 B<jobflow> makes it possible to set resource limits on the running
1036 jobs. This can be emulated by GNU B<parallel> using B<bash>'s B<ulimit>:
1037
1038   jobflow -limits=mem=100M,cpu=3,fsize=20M,nofiles=300 myjob
1039
1040   parallel 'ulimit -v 102400 -t 3 -f 20480 -n 300; myjob'
1041
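Limits set with B<ulimit> apply to the shell that sets them and to
everything that shell starts. A minimal sketch of confining them to a
single job by using a subshell. B<limited_job> is a hypothetical
helper name, only the open-files limit is shown, and it is assumed
that the hard limit is at least 300:

```shell
# Run a command under a resource limit without changing the limits of
# the calling shell: the subshell's limits disappear when it exits.
limited_job() {
  ( ulimit -n 300; "$@" )
}
# The child process reports the limit it inherited.
limited_job sh -c 'ulimit -n'
```
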
1042
1043 =head3 EXAMPLES FROM jobflow README
1044
1045   1$ cat things.list | jobflow -threads=8 -exec ./mytask {}
1046
1047   1$ cat things.list | parallel -j8 ./mytask {}
1048
1049   2$ seq 100 | jobflow -threads=100 -exec echo {}
1050
1051   2$ seq 100 | parallel -j100 echo {}
1052
1053   3$ cat urls.txt | jobflow -threads=32 -exec wget {}
1054
1055   3$ cat urls.txt | parallel -j32 wget {}
1056
1057   4$ find . -name '*.bmp' | \
1058        jobflow -threads=8 -exec bmp2jpeg {.}.bmp {.}.jpg
1059
1060   4$ find . -name '*.bmp' | \
1061        parallel -j8 bmp2jpeg {.}.bmp {.}.jpg
1062
1063   5$ seq 100 | jobflow -skip 10 -count 10
1064
1065   5$ seq 100 | parallel --filter '{1} > 10 and {1} <= 20' echo
1066
1067   5$ seq 100 | parallel echo '{= $_>10 and $_<=20 or skip() =}'
1068
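For orientation, example 5 selects input lines 11 through 20. The
plain pipeline below makes the same selection. It only illustrates
which lines are picked and runs nothing in parallel. B<pick_lines> is
a name made up for this sketch:

```shell
pick_lines() {
  # Skip the first $1 lines, then keep the next $2 lines.
  tail -n +"$(( $1 + 1 ))" | head -n "$2"
}
seq 100 | pick_lines 10 10
```
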
1069 https://github.com/rofl0r/jobflow
1070 (Last checked: 2022-05)
1071
1072
1073 =head2 DIFFERENCES BETWEEN gargs AND GNU Parallel
1074
1075 B<gargs> can run multiple jobs in parallel.
1076
1077 Older versions cache output in memory. This causes it to be extremely
1078 slow when the output is larger than the physical RAM, and can cause
1079 the system to run out of memory.
1080
1081 See more details on this in B<man parallel_design>.
1082
1083 Newer versions cache output in files, but leave files in $TMPDIR if
1084 B<gargs> is killed.
1085
1086 Output to stderr (standard error) is changed if the command fails.
1087
1088 =head3 EXAMPLES FROM gargs WEBSITE
1089
1090   1$ seq 12 -1 1 | gargs -p 4 -n 3 "sleep {0}; echo {1} {2}"
1091
1092   1$ seq 12 -1 1 | parallel -P 4 -n 3 "sleep {1}; echo {2} {3}"
1093
1094   2$ cat t.txt | gargs --sep "\s+" \
1095        -p 2 "echo '{0}:{1}-{2}' full-line: \'{}\'"
1096
1097   2$ cat t.txt | parallel --colsep "\\s+" \
1098        -P 2 "echo '{1}:{2}-{3}' full-line: \'{}\'"
1099
1100 https://github.com/brentp/gargs
1101 (Last checked: 2016-08)
1102
1103
1104 =head2 DIFFERENCES BETWEEN orgalorg AND GNU Parallel
1105
1106 B<orgalorg> can run the same job on multiple machines. This is related
1107 to B<--onall> and B<--nonall>.
1108
1109 B<orgalorg> supports entering the SSH password - provided it is the
1110 same for all servers. GNU B<parallel> advocates using B<ssh-agent>
1111 instead, but it is possible to emulate B<orgalorg>'s behavior by
1112 setting SSHPASS and by using B<--ssh "sshpass ssh">.
1113
1114 To make the emulation easier, make a simple alias:
1115
1116   alias par_emul="parallel -j0 --ssh 'sshpass ssh' --nonall --tag --lb"
1117
1118 If you want to supply a password run:
1119
1120   SSHPASS=`ssh-askpass`
1121
1122 or set the password directly:
1123
1124   SSHPASS='P4$$w0rd!'
1125
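When setting the password directly, note that the shell replaces an
unquoted B<$$> with its process ID, and B<!> can trigger history
expansion in an interactive shell. A small sketch of a literal
assignment, using the placeholder password from above:

```shell
# Single quotes keep '$$' and '!' literal.
SSHPASS='P4$$w0rd!'
# Exported so that child processes such as sshpass can see it.
export SSHPASS
printf '%s\n' "$SSHPASS"
```
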
1126 If the above is set up you can then do:
1127
1128   orgalorg -o frontend1 -o frontend2 -p -C uptime
1129   par_emul -S frontend1 -S frontend2 uptime
1130
1131   orgalorg -o frontend1 -o frontend2 -p -C top -bid 1
1132   par_emul -S frontend1 -S frontend2 top -bid 1
1133
1134   orgalorg -o frontend1 -o frontend2 -p -er /tmp -n \
1135     'md5sum /tmp/bigfile' -S bigfile
1136   par_emul -S frontend1 -S frontend2 --basefile bigfile \
1137     --workdir /tmp md5sum /tmp/bigfile
1138
1139 B<orgalorg> has a progress indicator for file transfers. GNU
1140 B<parallel> does not.
1141
1142 https://github.com/reconquest/orgalorg
1143 (Last checked: 2016-08)
1144
1145
1146 =head2 DIFFERENCES BETWEEN Rust parallel(mmstick) AND GNU Parallel
1147
1148 Rust parallel focuses on speed. It is almost as fast as B<xargs>, but
1149 not as fast as B<parallel-bash>. It implements a few features from GNU
1150 B<parallel>, but lacks many functions. All these fail:
1151
1152   # Read arguments from file
1153   parallel -a file echo
1154   # Changing the delimiter
1155   parallel -d _ echo ::: a_b_c_
1156
1157 These do something different from GNU B<parallel>:
1158
1159   # -q to protect quoted $ and space
1160   parallel -q perl -e '$a=shift; print "$a"x10000000' ::: a b c
1161   # Generation of combination of inputs
1162   parallel echo {1} {2} ::: red green blue ::: S M L XL XXL
1163   # {= perl expression =} replacement string
1164   parallel echo '{= s/new/old/ =}' ::: my.new your.new
1165   # --pipe
1166   seq 100000 | parallel --pipe wc
1167   # linked arguments
1168   parallel echo ::: S M L :::+ sml med lrg ::: R G B :::+ red grn blu
1169   # Run different shell dialects
1170   zsh -c 'parallel echo \={} ::: zsh && true'
1171   csh -c 'parallel echo \$\{\} ::: shell && true'
1172   bash -c 'parallel echo \$\({}\) ::: pwd && true'
1173   # Rust parallel does not start before the last argument is read
1174   (seq 10; sleep 5; echo 2) | time parallel -j2 'sleep 2; echo'
1175   tail -f /var/log/syslog | parallel echo
1176
1177 Most of the examples from the book GNU Parallel 2018 do not work, thus
1178 Rust parallel is not close to being a compatible replacement.
1179
1180 Rust parallel has no remote facilities.
1181
1182 It uses /tmp/parallel for tmp files and does not clean up if
1183 terminated abruptly. If another user on the system uses Rust parallel,
1184 then /tmp/parallel will have the wrong permissions and Rust parallel
1185 will fail. A malicious user can set up the right permissions and
1186 symlink the output file to one of the user's files and next time the
1187 user uses Rust parallel it will overwrite this file.
1188
1189   attacker$ mkdir /tmp/parallel
1190   attacker$ chmod a+rwX /tmp/parallel
1191   # Symlink to the file the attacker wants to zero out
1192   attacker$ ln -s ~victim/.important-file /tmp/parallel/stderr_1
1193   victim$ seq 1000 | parallel echo
1194   # This file is now overwritten with stderr from 'echo'
1195   victim$ cat ~victim/.important-file
1196
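The attack works because the directory name is predictable and shared
between users. As a general illustration (not a description of how
either tool is implemented), a script can avoid this class of problem
by asking B<mktemp> for a private directory. The B<job.XXXXXX>
template and the name B<private_tmpdir> are made up for this sketch:

```shell
# mktemp -d creates a new directory with an unpredictable name that
# only the calling user can read and write (mode 0700).
private_tmpdir() {
  mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX"
}
dir=$(private_tmpdir)
ls -ld "$dir"
rm -rf "$dir"
```
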
1197 If /tmp/parallel runs full during the run, Rust parallel does not
1198 report this, but finishes with success - thereby risking data loss.
1199
1200 https://github.com/mmstick/parallel
1201 (Last checked: 2016-08)
1202
1203
1204 =head2 DIFFERENCES BETWEEN Rush AND GNU Parallel
1205
1206 B<rush> (https://github.com/shenwei356/rush) is written in Go and
1207 based on B<gargs>.
1208
1209 Just like GNU B<parallel>, B<rush> buffers in temporary files. But
1210 unlike GNU B<parallel>, B<rush> does not clean up if the process
1211 dies abnormally.
1212
1213 B<rush> has some string manipulations that can be emulated by putting
1214 this into ~/.parallel/config (/ is used instead of %, and % is used
1215 instead of ^ as that is closer to bash's ${var%postfix}):
1216
1217   --rpl '{:} s:(\.[^/]+)*$::'
1218   --rpl '{:%([^}]+?)} s:$$1(\.[^/]+)*$::'
1219   --rpl '{/:%([^}]*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:'
1220   --rpl '{/:} s:(.*/)?([^/.]+)(\.[^/]+)*$:$2:'
1221   --rpl '{@(.*?)} /$$1/ and $_=$1;'
1222
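For comparison, these are the shell parameter expansions that the
notation is modelled on. This is a small sketch using the file name
from example 6; it does not involve B<rush> or GNU B<parallel>:

```shell
f=dir/file_1.txt.gz
echo "${f%_1.txt.gz}"   # remove a given suffix     -> dir/file
echo "${f##*/}"         # remove the directory part -> file_1.txt.gz
echo "${f%/*}"          # keep only the directory   -> dir
```
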
1223 =head3 EXAMPLES FROM rush's WEBSITE
1224
1225 Here are the examples from B<rush>'s website with the equivalent
1226 command in GNU B<parallel>.
1227
1228 B<1. Simple run, quoting is not necessary>
1229
1230   1$ seq 1 3 | rush echo {}
1231
1232   1$ seq 1 3 | parallel echo {}
1233
1234 B<2. Read data from file (`-i`)>
1235
1236   2$ rush echo {} -i data1.txt -i data2.txt
1237
1238   2$ cat data1.txt data2.txt | parallel echo {}
1239
1240 B<3. Keep output order (`-k`)>
1241
1242   3$ seq 1 3 | rush 'echo {}' -k
1243
1244   3$ seq 1 3 | parallel -k echo {}
1245
1246
1247 B<4. Timeout (`-t`)>
1248
1249   4$ time seq 1 | rush 'sleep 2; echo {}' -t 1
1250
1251   4$ time seq 1 | parallel --timeout 1 'sleep 2; echo {}'
1252
1253 B<5. Retry (`-r`)>
1254
1255   5$ seq 1 | rush 'python unexisted_script.py' -r 1
1256
1257   5$ seq 1 | parallel --retries 2 'python unexisted_script.py'
1258
1259 Use B<-u> to see it is really run twice:
1260
1261   5$ seq 1 | parallel -u --retries 2 'python unexisted_script.py'
1262
1263 B<6. Dirname (`{/}`) and basename (`{%}`) and remove custom
1264 suffix (`{^suffix}`)>
1265
1266   6$ echo dir/file_1.txt.gz | rush 'echo {/} {%} {^_1.txt.gz}'
1267
1268   6$ echo dir/file_1.txt.gz |
1269        parallel --plus echo {//} {/} {%_1.txt.gz}
1270
1271 B<7. Get basename, and remove last (`{.}`) or any (`{:}`) extension>
1272
1273   7$ echo dir.d/file.txt.gz | rush 'echo {.} {:} {%.} {%:}'
1274
1275   7$ echo dir.d/file.txt.gz | parallel 'echo {.} {:} {/.} {/:}'
1276
1277 B<8. Job ID, combine fields index and other replacement strings>
1278
1279   8$ echo 12 file.txt dir/s_1.fq.gz |
1280        rush 'echo job {#}: {2} {2.} {3%:^_1}'
1281
1282   8$ echo 12 file.txt dir/s_1.fq.gz |
1283        parallel --colsep ' ' 'echo job {#}: {2} {2.} {3/:%_1}'
1284
1285 B<9. Capture submatch using regular expression (`{@regexp}`)>
1286
1287   9$ echo read_1.fq.gz | rush 'echo {@(.+)_\d}'
1288
1289   9$ echo read_1.fq.gz | parallel 'echo {@(.+)_\d}'
1290
1291 B<10. Custom field delimiter (`-d`)>
1292
1293   10$ echo a=b=c | rush 'echo {1} {2} {3}' -d =
1294
1295   10$ echo a=b=c | parallel -d = echo {1} {2} {3}
1296
1297 B<11. Send multi-lines to every command (`-n`)>
1298
1299   11$ seq 5 | rush -n 2 -k 'echo "{}"; echo'
1300
1301   11$ seq 5 |
1302         parallel -n 2 -k \
1303           'echo {=-1 $_=join"\n",@arg[1..$#arg] =}; echo'
1304
1305   11$ seq 5 | rush -n 2 -k 'echo "{}"; echo' -J ' '
1306
1307   11$ seq 5 | parallel -n 2 -k 'echo {}; echo'
1308
1309
1310 B<12. Custom record delimiter (`-D`), note that empty records are not used.>
1311
1312   12$ echo a b c d | rush -D " " -k 'echo {}'
1313
1314   12$ echo a b c d | parallel -d " " -k 'echo {}'
1315
1316   12$ echo abcd | rush -D "" -k 'echo {}'
1317
1318   Cannot be done by GNU Parallel
1319
1320   12$ cat fasta.fa
1321   >seq1
1322   tag
1323   >seq2
1324   cat
1325   gat
1326   >seq3
1327   attac
1328   a
1329   cat
1330
1331   12$ cat fasta.fa | rush -D ">" \
1332         'echo FASTA record {#}: name: {1} sequence: {2}' -k -d "\n"
1333       # rush fails to join the multiline sequences
1334
1335   12$ cat fasta.fa | (read -n1 ignore_first_char;
1336         parallel -d '>' --colsep '\n' echo FASTA record {#}: \
1337           name: {1} sequence: '{=2 $_=join"",@arg[2..$#arg]=}'
1338       )
1339
1340 B<13. Assign value to variable, like `awk -v` (`-v`)>
1341
1342   13$ seq 1 |
1343         rush 'echo Hello, {fname} {lname}!' -v fname=Wei -v lname=Shen
1344
1345   13$ seq 1 |
1346         parallel -N0 \
1347           'fname=Wei; lname=Shen; echo Hello, ${fname} ${lname}!'
1348
1349   13$ for var in a b; do \
1350         seq 1 3 | rush -k -v var=$var 'echo var: {var}, data: {}'; \
1351       done
1352
1353 In GNU B<parallel> you would typically do:
1354
1355   13$ seq 1 3 | parallel -k echo var: {1}, data: {2} ::: a b :::: -
1356
1357 If you I<really> want the var:
1358
1359   13$ seq 1 3 |
1360         parallel -k var={1} ';echo var: $var, data: {}' ::: a b :::: -
1361
1362 If you I<really> want the B<for>-loop:
1363
1364   13$ for var in a b; do
1365         export var;
1366         seq 1 3 | parallel -k 'echo var: $var, data: {}';
1367       done
1368
1369 Contrary to B<rush>, this also works if the value is complex like:
1370
1371   My brother's 12" records
1372
1373
1374 B<14. Preset variable (`-v`), avoid repeatedly writing verbose replacement strings>
1375
1376   14$ # naive way
1377       echo read_1.fq.gz | rush 'echo {:^_1} {:^_1}_2.fq.gz'
1378
1379   14$ echo read_1.fq.gz | parallel 'echo {:%_1} {:%_1}_2.fq.gz'
1380
1381   14$ # macro + removing suffix
1382       echo read_1.fq.gz |
1383         rush -v p='{:^_1}' 'echo {p} {p}_2.fq.gz'
1384
1385   14$ echo read_1.fq.gz |
1386         parallel 'p={:%_1}; echo $p ${p}_2.fq.gz'
1387
1388   14$ # macro + regular expression
1389       echo read_1.fq.gz | rush -v p='{@(.+?)_\d}' 'echo {p} {p}_2.fq.gz'
1390
1391   14$ echo read_1.fq.gz | parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1392
1393 Contrary to B<rush>, GNU B<parallel> works with complex values:
1394
1395   14$ echo "My brother's 12\"read_1.fq.gz" |
1396         parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1397
1398 B<15. Interrupt jobs by `Ctrl-C`, rush will stop unfinished commands and exit.>
1399
1400   15$ seq 1 20 | rush 'sleep 1; echo {}'
1401       ^C
1402
1403   15$ seq 1 20 | parallel 'sleep 1; echo {}'
1404       ^C
1405
1406 B<16. Continue/resume jobs (`-c`). When some jobs failed (by
1407 execution failure, timeout, or canceling by user with `Ctrl + C`),
1408 please switch flag `-c/--continue` on and run again, so that `rush`
1409 can save successful commands and ignore them in I<NEXT> run.>
1410
1411   16$ seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1412       cat successful_cmds.rush
1413       seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1414
1415   16$ seq 1 3 | parallel --joblog mylog --timeout 2 \
1416         'sleep {}; echo {}'
1417       cat mylog
1418       seq 1 3 | parallel --joblog mylog --retry-failed \
1419         'sleep {}; echo {}'
1420
1421 Multi-line jobs:
1422
1423   16$ seq 1 3 | rush 'sleep {}; echo {}; \
1424         echo finish {}' -t 3 -c -C finished.rush
1425       cat finished.rush
1426       seq 1 3 | rush 'sleep {}; echo {}; \
1427         echo finish {}' -t 3 -c -C finished.rush
1428
1429   16$ seq 1 3 |
1430         parallel --joblog mylog --timeout 2 'sleep {}; echo {}; \
1431           echo finish {}'
1432       cat mylog
1433       seq 1 3 |
1434         parallel --joblog mylog --retry-failed 'sleep {}; echo {}; \
1435           echo finish {}'
1436
1437 B<17. A comprehensive example: downloading 1K+ pages given by
1438 three URL list files using `phantomjs save_page.js` (some page
1439 contents are dynamically generated by Javascript, so `wget` does not
1440 work). Here I set max jobs number (`-j`) as `20`, each job has a max
1441 running time (`-t`) of `60` seconds and `3` retry chances
1442 (`-r`). Continue flag `-c` is also switched on, so we can continue
1443 unfinished jobs. Luckily, it's accomplished in one run :)>
1444
1445   17$ for f in $(seq 2014 2016); do \
1446         /bin/rm -rf $f; mkdir -p $f; \
1447         cat $f.html.txt | rush -v d=$f -d = \
1448           'phantomjs save_page.js "{}" > {d}/{3}.html' \
1449           -j 20 -t 60 -r 3 -c; \
1450       done
1451
1452 GNU B<parallel> can append to an existing joblog with '+':
1453
1454   17$ rm mylog
1455       for f in $(seq 2014 2016); do
1456         /bin/rm -rf $f; mkdir -p $f;
1457         cat $f.html.txt |
1458           parallel -j20 --timeout 60 --retries 4 --joblog +mylog \
1459             --colsep = \
1460             phantomjs save_page.js {1}={2}={3} '>' $f/{3}.html
1461       done
1462
1463 B<18. A bioinformatics example: mapping with `bwa`, and
1464 processing result with `samtools`:>
1465
1466   18$ ref=ref/xxx.fa
1467       threads=25
1468       ls -d raw.cluster.clean.mapping/* \
1469         | rush -v ref=$ref -v j=$threads -v p='{}/{%}' \
1470         'bwa mem -t {j} -M -a {ref} {p}_1.fq.gz {p}_2.fq.gz >{p}.sam;\
1471         samtools view -bS {p}.sam > {p}.bam; \
1472         samtools sort -T {p}.tmp -@ {j} {p}.bam -o {p}.sorted.bam; \
1473         samtools index {p}.sorted.bam; \
1474         samtools flagstat {p}.sorted.bam > {p}.sorted.bam.flagstat; \
1475         /bin/rm {p}.bam {p}.sam;' \
1476         -j 2 --verbose -c -C mapping.rush
1477
1478 GNU B<parallel> would use a function:
1479
1480   18$ ref=ref/xxx.fa
1481       export ref
1482       thr=25
1483       export thr
1484       bwa_sam() {
1485         p="$1"
1486         bam="$p".bam
1487         sam="$p".sam
1488         sortbam="$p".sorted.bam
1489         bwa mem -t $thr -M -a $ref ${p}_1.fq.gz ${p}_2.fq.gz > "$sam"
1490         samtools view -bS "$sam" > "$bam"
1491         samtools sort -T ${p}.tmp -@ $thr "$bam" -o "$sortbam"
1492         samtools index "$sortbam"
1493         samtools flagstat "$sortbam" > "$sortbam".flagstat
1494         /bin/rm "$bam" "$sam"
1495       }
1496       export -f bwa_sam
1497       ls -d raw.cluster.clean.mapping/* |
1498         parallel -j 2 --verbose --joblog mylog bwa_sam
1499
1500 =head3 Other B<rush> features
1501
1502 B<rush> has:
1503
1504 =over 4
1505
1506 =item * B<awk -v> like custom defined variables (B<-v>)
1507
1508 With GNU B<parallel> you would simply set a shell variable:
1509
1510    parallel 'v={}; echo "$v"' ::: foo
1511    echo foo | rush -v v={} 'echo {v}'
1512
1513 Also B<rush> does not like special chars. So these B<do not work>:
1514
1515    echo does not work | rush -v v=\" 'echo {v}'
1516    echo "My  brother's  12\"  records" | rush -v v={} 'echo {v}'
1517
1518 Whereas the corresponding GNU B<parallel> version works:
1519
1520    parallel 'v=\"; echo "$v"' ::: works
1521    parallel 'v={}; echo "$v"' ::: "My  brother's  12\"  records"
1522
1523 =item * Exit on first error(s) (-e)
1524
1525 This is called B<--halt now,fail=1> (or shorter: B<--halt 2>) when
1526 used with GNU B<parallel>.
1527
1528 =item * Settable number of records sent to every command (B<-n>, default 1)
1529
1530 This is also called B<-n> in GNU B<parallel>.
1531
1532 =item * Practical replacement strings
1533
1534 =over 4
1535
1536 =item {:} remove any extension
1537
1538 With GNU B<parallel> this can be emulated by:
1539
1540   parallel --plus echo '{/\..*/}' ::: foo.ext.bar.gz
1541
1542 =item {^suffix}, remove suffix
1543
1544 With GNU B<parallel> this can be emulated by:
1545
1546   parallel --plus echo '{%.bar.gz}' ::: foo.ext.bar.gz
1547
1548 =item {@regexp}, capture submatch using regular expression
1549
1550 With GNU B<parallel> this can be emulated by:
1551
1552   parallel --rpl '{@(.*?)} /$$1/ and $_=$1;' \
1553     echo '{@\d_(.*).gz}' ::: 1_foo.gz
1554
1555 =item {%.}, {%:}, basename without extension
1556
1557 With GNU B<parallel> this can be emulated by:
1558
1559   parallel echo '{= s:.*/::;s/\..*// =}' ::: dir/foo.bar.gz
1560
1561 And if you need it often, you define a B<--rpl> in
1562 B<$HOME/.parallel/config>:
1563
1564   --rpl '{%.} s:.*/::;s/\..*//'
1565   --rpl '{%:} s:.*/::;s/\..*//'
1566
1567 Then you can use them as:
1568
1569   parallel echo {%.} {%:} ::: dir/foo.bar.gz
1570
1571 =back
1572
1573 =item * Preset variable (macro)
1574
1575 E.g.
1576
1577   echo foosuffix | rush -v p={^suffix} 'echo {p}_new_suffix'
1578
1579 With GNU B<parallel> this can be emulated by:
1580
1581   echo foosuffix |
1582     parallel --plus 'p={%suffix}; echo ${p}_new_suffix'
1583
1584 Unlike B<rush>, GNU B<parallel> works fine if the input contains
1585 double space, ' and ":
1586
1587   echo "1'6\"  foosuffix" |
1588     parallel --plus 'p={%suffix}; echo "${p}"_new_suffix'
1589
1590
1591 =item * Commands of multi-lines
1592
1593 While you I<can> use multi-lined commands in GNU B<parallel>, to
1594 improve readability GNU B<parallel> discourages the use of multi-line
1595 commands. In most cases it can be written as a function:
1596
1597   seq 1 3 |
1598     parallel --timeout 2 --joblog my.log 'sleep {}; echo {}; \
1599       echo finish {}'
1600
1601 Could be written as:
1602
1603   doit() {
1604     sleep "$1"
1605     echo "$1"
1606     echo finish "$1"
1607   }
1608   export -f doit
1609   seq 1 3 | parallel --timeout 2 --joblog my.log doit
1610
1611 The failed commands can be resumed with:
1612
1613   seq 1 3 |
1614     parallel --resume-failed --joblog my.log 'sleep {}; echo {};\
1615       echo finish {}'
1616
1617 =back
1618
1619 https://github.com/shenwei356/rush
1620 (Last checked: 2017-05)
1621
1622
1623 =head2 DIFFERENCES BETWEEN ClusterSSH AND GNU Parallel
1624
1625 ClusterSSH solves a different problem than GNU B<parallel>.
1626
1627 ClusterSSH opens a terminal window for each computer, and using a
1628 master window you can run the same command on all the computers. This
1629 is typically used for administering several computers that are almost
1630 identical.
1631
1632 GNU B<parallel> runs the same (or different) commands with different
1633 arguments in parallel, possibly using remote computers to help
1634 computing. If more than one computer is listed in B<-S>, GNU B<parallel> may
1635 only use one of these (e.g. if there are 8 jobs to be run and one
1636 computer has 8 cores).
1637
1638 GNU B<parallel> can be used as a poor-man's version of ClusterSSH:
1639
1640 B<parallel --nonall -S server-a,server-b do_stuff foo bar>
1641
1642 https://github.com/duncs/clusterssh
1643 (Last checked: 2010-12)
1644
1645
1646 =head2 DIFFERENCES BETWEEN coshell AND GNU Parallel
1647
1648 B<coshell> only accepts full commands on standard input. Any quoting
1649 needs to be done by the user.
1650
1651 Commands are run in B<sh> so any B<bash>/B<tcsh>/B<zsh> specific
1652 syntax will not work.
1653
1654 Output can be buffered by using B<-d>. Output is buffered in memory,
1655 so big output can cause swapping and therefore be terribly slow or
1656 even cause the system to run out of memory.
1657
1658 https://github.com/gdm85/coshell
1659 (Last checked: 2019-01)
1660
1661
1662 =head2 DIFFERENCES BETWEEN spread AND GNU Parallel
1663
1664 B<spread> runs commands on all directories.
1665
1666 It can be emulated with GNU B<parallel> using this Bash function:
1667
1668   spread() {
1669     _cmds() {
1670       perl -e '$"=" && ";print "@ARGV"' "cd {}" "$@"
1671     }
1672     parallel $(_cmds "$@")'|| echo exit status $?' ::: */
1673   }
1674
1675 This works except for the B<--exclude> option.
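
To see what the helper B<_cmds> hands to GNU B<parallel>, here is a
sketch of the same join written in plain shell. B<join_and> is a name
made up for this illustration, and the commands passed to it are
arbitrary examples:

```shell
join_and() {
  # Join all arguments with ' && ', like the perl one-liner in _cmds.
  out=$1
  shift
  for c in "$@"; do
    out="$out && $c"
  done
  printf '%s\n' "$out"
}
join_and "cd {}" "git status" "ls"
```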
1676
1677 (Last checked: 2017-11)
1678
1679
1680 =head2 DIFFERENCES BETWEEN pyargs AND GNU Parallel
1681
1682 B<pyargs> deals badly with input containing spaces. It buffers stdout,
1683 but not stderr. It buffers in RAM. {} does not work as replacement
1684 string. It does not support running functions.
1685
1686 B<pyargs> does not support composed commands if run with B<--lines>,
1687 and fails on B<pyargs traceroute gnu.org fsf.org>.
1688
1689 =head3 Examples
1690
1691   seq 5 | pyargs -P50 -L seq
1692   seq 5 | parallel -P50 --lb seq
1693
1694   seq 5 | pyargs -P50 --mark -L seq
1695   seq 5 | parallel -P50 --lb \
1696     --tagstring OUTPUT'[{= $_=$job->replaced() =}]' seq
1697   # Similar, but not precisely the same
1698   seq 5 | parallel -P50 --lb --tag seq
1699
1700   seq 5 | pyargs -P50  --mark command
1701   # Somewhat longer with GNU Parallel due to the special
1702   #   --mark formatting
1703   cmd="$(echo "command" | parallel --shellquote)"
1704   wrap_cmd() {
1705      echo "MARK $cmd $@================================" >&3
1706      echo "OUTPUT START[$cmd $@]:"
1707      eval $cmd "$@"
1708      echo "OUTPUT END[$cmd $@]"
1709   }
1710   (seq 5 | env_parallel -P2 wrap_cmd) 3>&1
1711   # Similar, but not exactly the same
1712   seq 5 | parallel -t --tag command
1713
1714   (echo '1  2  3';echo 4 5 6) | pyargs  --stream seq
1715   (echo '1  2  3';echo 4 5 6) | perl -pe 's/\n/ /' |
1716     parallel -r -d' ' seq
1717   # Similar, but not exactly the same
1718   parallel seq ::: 1 2 3 4 5 6
1719
1720 https://github.com/robertblackwell/pyargs
1721 (Last checked: 2019-01)
1722
1723
1724 =head2 DIFFERENCES BETWEEN concurrently AND GNU Parallel
1725
1726 B<concurrently> runs jobs in parallel.
1727
1728 The output is prepended with the job number, and may be incomplete:
1729
1730   $ concurrently 'seq 100000' | (sleep 3;wc -l)
1731   7165
1732
1733 When pretty printing it caches output in memory. With the test MIX
1734 below, output mixes whether or not it is cached.
1735
1736 There seems to be no way of making a template command and have
1737 B<concurrently> fill that with different args. The full commands must
1738 be given on the command line.
1739
1740 There is also no way of controlling how many jobs should be run in
1741 parallel at a time - i.e. "number of jobslots". Instead all jobs are
1742 simply started in parallel.
1743
1744 https://github.com/kimmobrunfeldt/concurrently
1745 (Last checked: 2019-01)
1746
1747
1748 =head2 DIFFERENCES BETWEEN map(soveran) AND GNU Parallel
1749
1750 B<map> does not run jobs in parallel by default. The README suggests using:
1751
1752   ... | map t 'sleep $t && say done &'
1753
1754 But this fails if more jobs are run in parallel than the number of
1755 available processes. Since there is no support for parallelization in
1756 B<map> itself, the output also mixes:
1757
1758   seq 10 | map i 'echo start-$i && sleep 0.$i && echo end-$i &'
1759
1760 The major difference is that GNU B<parallel> is built for parallelization
1761 and B<map> is not. So GNU B<parallel> has lots of ways of dealing with the
1762 issues that parallelization raises:
1763
1764 =over 4
1765
1766 =item *
1767
1768 Keep the number of processes manageable
1769
1770 =item *
1771
1772 Make sure output does not mix
1773
1774 =item *
1775
1776 Make Ctrl-C kill all running processes
1777
1778 =back
1779
1780 =head3 EXAMPLES FROM map's WEBSITE
1781
1782 Here are the 5 examples converted to GNU Parallel:
1783
1784   1$ ls *.c | map f 'foo $f'
1785   1$ ls *.c | parallel foo
1786
1787   2$ ls *.c | map f 'foo $f; bar $f'
1788   2$ ls *.c | parallel 'foo {}; bar {}'
1789
1790   3$ cat urls | map u 'curl -O $u'
1791   3$ cat urls | parallel curl -O
1792
1793   4$ printf "1\n1\n1\n" | map t 'sleep $t && say done'
1794   4$ printf "1\n1\n1\n" | parallel 'sleep {} && say done'
1795   4$ parallel 'sleep {} && say done' ::: 1 1 1
1796
1797   5$ printf "1\n1\n1\n" | map t 'sleep $t && say done &'
1798   5$ printf "1\n1\n1\n" | parallel -j0 'sleep {} && say done'
1799   5$ parallel -j0 'sleep {} && say done' ::: 1 1 1
1800
1801 https://github.com/soveran/map
1802 (Last checked: 2019-01)
1803
1804
1805 =head2 DIFFERENCES BETWEEN loop AND GNU Parallel
1806
1807 B<loop> mixes stdout and stderr:
1808
1809     loop 'ls /no-such-file' >/dev/null
1810
1811 B<loop>'s replacement string B<$ITEM> does not quote strings:
1812
1813     echo 'two  spaces' | loop 'echo $ITEM'
1814
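The effect resembles what happens with an unquoted variable in the
shell itself. A short sketch of the difference, where B<item> stands
in for the value that B<loop> substitutes:

```shell
item='two  spaces'
# Unquoted: the shell splits the value into words and echo joins
# them with a single space.
echo $item
# Quoted: the value is passed as one argument and both spaces survive.
echo "$item"
```
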
1815 B<loop> cannot run functions:
1816
1817     myfunc() { echo joe; }
1818     export -f myfunc
1819     loop 'myfunc this fails'
1820
1821 =head3 EXAMPLES FROM loop's WEBSITE
1822
1823 Some of the examples from https://github.com/Miserlou/Loop/ can be
1824 emulated with GNU B<parallel>:
1825
1826     # A couple of functions will make the code easier to read
1827     $ loopy() {
1828         yes | parallel -uN0 -j1 "$@"
1829       }
1830     $ export -f loopy
1831     $ time_out() {
1832         parallel -uN0 -q --timeout "$@" ::: 1
1833       }
1834     $ match() {
1835         perl -0777 -ne 'grep /'"$1"'/,$_ and print or exit 1'
1836       }
1837     $ export -f match
1838
1839     $ loop 'ls' --every 10s
1840     $ loopy --delay 10s ls
1841
1842     $ loop 'touch $COUNT.txt' --count-by 5
1843     $ loopy touch '{= $_=seq()*5 =}'.txt
1844
1845     $ loop --until-contains 200 -- \
1846         ./get_response_code.sh --site mysite.biz
1847     $ loopy --halt now,success=1 \
1848         './get_response_code.sh --site mysite.biz | match 200'
1849
1850     $ loop './poke_server' --for-duration 8h
1851     $ time_out 8h loopy ./poke_server
1852
1853     $ loop './poke_server' --until-success
1854     $ loopy --halt now,success=1 ./poke_server
1855
1856     $ cat files_to_create.txt | loop 'touch $ITEM'
1857     $ cat files_to_create.txt | parallel touch {}
1858
1859     $ loop 'ls' --for-duration 10min --summary
1860     # --joblog is somewhat more verbose than --summary
1861     $ time_out 10m loopy --joblog my.log ls; cat my.log
1862
1863     $ loop 'echo hello'
1864     $ loopy echo hello
1865
1866     $ loop 'echo $COUNT'
1867     # GNU Parallel counts from 1
1868     $ loopy echo {#}
1869     # Counting from 0 can be forced
1870     $ loopy echo '{= $_=seq()-1 =}'
1871
1872     $ loop 'echo $COUNT' --count-by 2
1873     $ loopy echo '{= $_=2*(seq()-1) =}'
1874
1875     $ loop 'echo $COUNT' --count-by 2 --offset 10
1876     $ loopy echo '{= $_=10+2*(seq()-1) =}'
1877
1878     $ loop 'echo $COUNT' --count-by 1.1
1879     # GNU Parallel rounds 3.3000000000000003 to 3.3
1880     $ loopy echo '{= $_=1.1*(seq()-1) =}'
1881
1882     $ loop 'echo $COUNT $ACTUALCOUNT' --count-by 2
1883     $ loopy echo '{= $_=2*(seq()-1) =} {#}'
1884
1885     $ loop 'echo $COUNT' --num 3 --summary
1886     # --joblog is somewhat more verbose than --summary
1887     $ seq 3 | parallel --joblog my.log echo; cat my.log
1888
1889     $ loop 'ls -foobarbatz' --num 3 --summary
1890     # --joblog is somewhat more verbose than --summary
1891     $ seq 3 | parallel --joblog my.log -N0 ls -foobarbatz; cat my.log
1892
1893     $ loop 'echo $COUNT' --count-by 2 --num 50 --only-last
1894     # Can be emulated by running 2 jobs
1895     $ seq 49 | parallel echo '{= $_=2*(seq()-1) =}' >/dev/null
1896     $ echo 50| parallel echo '{= $_=2*(seq()-1) =}'
1897
1898     $ loop 'date' --every 5s
1899     $ loopy --delay 5s date
1900
1901     $ loop 'date' --for-duration 8s --every 2s
1902     $ time_out 8s loopy --delay 2s date
1903
1904     $ loop 'date -u' --until-time '2018-05-25 20:50:00' --every 5s
1905     $ seconds=$((`date -d 2018-05-25T20:50:00 +%s` - `date  +%s`))s
1906     $ time_out $seconds loopy --delay 5s date -u
1907
1908     $ loop 'echo $RANDOM' --until-contains "666"
1909     $ loopy --halt now,success=1 'echo $RANDOM | match 666'
1910
1911     $ loop 'if (( RANDOM % 2 )); then
1912               (echo "TRUE"; true);
1913             else
1914               (echo "FALSE"; false);
1915             fi' --until-success
1916     $ loopy --halt now,success=1 'if (( $RANDOM % 2 )); then
1917                                     (echo "TRUE"; true);
1918                                   else
1919                                     (echo "FALSE"; false);
1920                                   fi'
1921
1922     $ loop 'if (( RANDOM % 2 )); then
1923         (echo "TRUE"; true);
1924       else
1925         (echo "FALSE"; false);
1926       fi' --until-error
1927     $ loopy --halt now,fail=1 'if (( $RANDOM % 2 )); then
1928                                  (echo "TRUE"; true);
1929                                else
1930                                  (echo "FALSE"; false);
1931                                fi'
1932
1933     $ loop 'date' --until-match "(\d{4})"
1934     $ loopy --halt now,success=1 'date | match [0-9][0-9][0-9][0-9]'
1935
1936     $ loop 'echo $ITEM' --for red,green,blue
1937     $ parallel echo ::: red green blue
1938
1939     $ cat /tmp/my-list-of-files-to-create.txt | loop 'touch $ITEM'
1940     $ cat /tmp/my-list-of-files-to-create.txt | parallel touch
1941
1942     $ ls | loop 'cp $ITEM $ITEM.bak'; ls
1943     $ ls | parallel cp {} {}.bak; ls
1944
1945     $ loop 'echo $ITEM | tr a-z A-Z' -i
1946     $ parallel 'echo {} | tr a-z A-Z'
1947     # Or more efficiently:
1948     $ parallel --pipe tr a-z A-Z
1949
1950     $ loop 'echo $ITEM' --for "`ls`"
1951     $ parallel echo {} ::: "`ls`"
1952
1953     $ ls | loop './my_program $ITEM' --until-success;
1954     $ ls | parallel --halt now,success=1 ./my_program {}
1955
1956     $ ls | loop './my_program $ITEM' --until-fail;
1957     $ ls | parallel --halt now,fail=1 ./my_program {}
1958
1959     $ ./deploy.sh;
1960       loop 'curl -sw "%{http_code}" http://coolwebsite.biz' \
1961         --every 5s --until-contains 200;
1962       ./announce_to_slack.sh
1963     $ ./deploy.sh;
1964       loopy --delay 5s --halt now,success=1 \
1965       'curl -sw "%{http_code}" http://coolwebsite.biz | match 200';
1966       ./announce_to_slack.sh
1967
1968     $ loop "ping -c 1 mysite.com" --until-success; ./do_next_thing
1969     $ loopy --halt now,success=1 ping -c 1 mysite.com; ./do_next_thing
1970
1971     $ ./create_big_file -o my_big_file.bin;
1972       loop 'ls' --until-contains 'my_big_file.bin';
1973       ./upload_big_file my_big_file.bin
1974     # inotifywait is a better tool to detect file system changes.
1975     # It can even make sure the file is complete
1976     # so you are not uploading an incomplete file
1977     $ inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f . |
1978         grep my_big_file.bin
1979
1980     $ ls | loop 'cp $ITEM $ITEM.bak'
1981     $ ls | parallel cp {} {}.bak
1982
1983     $ loop './do_thing.sh' --every 15s --until-success --num 5
1984     $ parallel --retries 5 --delay 15s ::: ./do_thing.sh
1985
1986 https://github.com/Miserlou/Loop/
1987 (Last checked: 2018-10)
1988
1989
1990 =head2 DIFFERENCES BETWEEN lorikeet AND GNU Parallel
1991
1992 B<lorikeet> can run jobs in parallel. It does this based on a
1993 dependency graph described in a file, so this is similar to B<make>.
1994
1995 https://github.com/cetra3/lorikeet
1996 (Last checked: 2018-10)
1997
1998
1999 =head2 DIFFERENCES BETWEEN spp AND GNU Parallel
2000
2001 B<spp> can run jobs in parallel. B<spp> does not use a command
2002 template to generate the jobs, but requires jobs to be in a
2003 file. Output from different jobs mixes.
2004
2005 https://github.com/john01dav/spp
2006 (Last checked: 2019-01)
2007
2008
2009 =head2 DIFFERENCES BETWEEN paral AND GNU Parallel
2010
2011 B<paral> prints a lot of status information and stores the output from
2012 the commands run into files. This means it cannot be used in the
2013 middle of a pipe like this:
2014
2015   paral "echo this" "echo does not" "echo work" | wc
2016
2017 Instead it puts the output into files named like
2018 B<out_#_I<command>.out.log>. To get a very similar behaviour with GNU
2019 B<parallel> use B<--results
2020 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta>
2021
2022 B<paral> only takes arguments on the command line and each argument
2023 should be a full command. Thus it does not use command templates.
2024
2025 This limits how many jobs it can run in total, because they all need
2026 to fit on a single command line.
2027
2028 B<paral> has no support for running jobs remotely.
2029
2030 =head3 EXAMPLES FROM README.markdown
2031
2032 The examples from B<README.markdown> and the corresponding command run
2033 with GNU B<parallel> (B<--results
2034 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta> is omitted from
2035 the GNU B<parallel> command):
2036
2037   1$ paral "command 1" "command 2 --flag" "command arg1 arg2"
2038   1$ parallel ::: "command 1" "command 2 --flag" "command arg1 arg2"
2039
2040   2$ paral "sleep 1 && echo c1" "sleep 2 && echo c2" \
2041        "sleep 3 && echo c3" "sleep 4 && echo c4"  "sleep 5 && echo c5"
2042   2$ parallel ::: "sleep 1 && echo c1" "sleep 2 && echo c2" \
2043        "sleep 3 && echo c3" "sleep 4 && echo c4"  "sleep 5 && echo c5"
2044      # Or shorter:
2045      parallel "sleep {} && echo c{}" ::: {1..5}
2046
2047   3$ paral -n=0 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2048        "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2049   3$ parallel ::: "sleep 5 && echo c5" "sleep 4 && echo c4" \
2050        "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2051      # Or shorter:
2052      parallel -j0 "sleep {} && echo c{}" ::: 5 4 3 2 1
2053
2054   4$ paral -n=1 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2055        "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2056   4$ parallel -j1 "sleep {} && echo c{}" ::: 5 4 3 2 1
2057
2058   5$ paral -n=2 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2059        "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2060   5$ parallel -j2 "sleep {} && echo c{}" ::: 5 4 3 2 1
2061
2062   6$ paral -n=5 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2063        "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2064   6$ parallel -j5 "sleep {} && echo c{}" ::: 5 4 3 2 1
2065
2066   7$ paral -n=1 "echo a && sleep 0.5 && echo b && sleep 0.5 && \
2067        echo c && sleep 0.5 && echo d && sleep 0.5 && \
2068        echo e && sleep 0.5 && echo f && sleep 0.5 && \
2069        echo g && sleep 0.5 && echo h"
2070   7$ parallel ::: "echo a && sleep 0.5 && echo b && sleep 0.5 && \
2071        echo c && sleep 0.5 && echo d && sleep 0.5 && \
2072        echo e && sleep 0.5 && echo f && sleep 0.5 && \
2073        echo g && sleep 0.5 && echo h"
2074
2075 https://github.com/amattn/paral
2076 (Last checked: 2019-01)
2077
2078
2079 =head2 DIFFERENCES BETWEEN concurr AND GNU Parallel
2080
2081 B<concurr> is built to run jobs in parallel using a client/server
2082 model.
2083
2084 =head3 EXAMPLES FROM README.md
2085
2086 The examples from B<README.md>:
2087
2088   1$ concurr 'echo job {#} on slot {%}: {}' : arg1 arg2 arg3 arg4
2089   1$ parallel 'echo job {#} on slot {%}: {}' ::: arg1 arg2 arg3 arg4
2090
2091   2$ concurr 'echo job {#} on slot {%}: {}' :: file1 file2 file3
2092   2$ parallel 'echo job {#} on slot {%}: {}' :::: file1 file2 file3
2093
2094   3$ concurr 'echo {}' < input_file
2095   3$ parallel 'echo {}' < input_file
2096
2097   4$ cat file | concurr 'echo {}'
2098   4$ cat file | parallel 'echo {}'
2099
2100 B<concurr> deals badly with empty input files and with output larger than
2101 64 KB.
2102
2103 https://github.com/mmstick/concurr
2104 (Last checked: 2019-01)
2105
2106
2107 =head2 DIFFERENCES BETWEEN lesser-parallel AND GNU Parallel
2108
2109 B<lesser-parallel> is the inspiration for B<parallel --embed>. Both
2110 B<lesser-parallel> and B<parallel --embed> define bash functions that
2111 can be included as part of a bash script to run jobs in parallel.
2112
2113 B<lesser-parallel> implements a few of the replacement strings, but
2114 hardly any options, whereas B<parallel --embed> gives you the full
2115 GNU B<parallel> experience.
2116
2117 https://github.com/kou1okada/lesser-parallel
2118 (Last checked: 2019-01)
2119
2120
2121 =head2 DIFFERENCES BETWEEN npm-parallel AND GNU Parallel
2122
2123 B<npm-parallel> can run npm tasks in parallel.
2124
2125 There are no examples and very little documentation, so it is hard to
2126 compare to GNU B<parallel>.
2127
2128 https://github.com/spion/npm-parallel
2129 (Last checked: 2019-01)
2130
2131
2132 =head2 DIFFERENCES BETWEEN machma AND GNU Parallel
2133
2134 B<machma> runs tasks in parallel. It gives time stamped
2135 output. It buffers in RAM.
2136
2137 =head3 EXAMPLES FROM README.md
2138
2139 The examples from README.md:
2140
2141   1$ # Put shorthand for timestamp in config for the examples
2142      echo '--rpl '\
2143        \''{time} $_=::strftime("%Y-%m-%d %H:%M:%S",localtime())'\' \
2144        > ~/.parallel/machma
2145      echo '--line-buffer --tagstring "{#} {time} {}"' \
2146        >> ~/.parallel/machma
2147
2148   2$ find . -iname '*.jpg' |
2149        machma --  mogrify -resize 1200x1200 -filter Lanczos {}
2150   2$ find . -iname '*.jpg' |
2151        parallel --bar -Jmachma mogrify -resize 1200x1200 \
2152          -filter Lanczos {}
2153
2154   3$ cat /tmp/ips | machma -p 2 -- ping -c 2 -q {}
2155   3$ cat /tmp/ips | parallel -j2 -Jmachma ping -c 2 -q {}
2156
2157   4$ cat /tmp/ips |
2158        machma -- sh -c 'ping -c 2 -q $0 > /dev/null && echo alive' {}
2159   4$ cat /tmp/ips |
2160        parallel -Jmachma 'ping -c 2 -q {} > /dev/null && echo alive'
2161
2162   5$ find . -iname '*.jpg' |
2163        machma --timeout 5s -- mogrify -resize 1200x1200 \
2164          -filter Lanczos {}
2165   5$ find . -iname '*.jpg' |
2166        parallel --timeout 5s --bar mogrify -resize 1200x1200 \
2167          -filter Lanczos {}
2168
2169   6$ find . -iname '*.jpg' -print0 |
2170        machma --null --  mogrify -resize 1200x1200 -filter Lanczos {}
2171   6$ find . -iname '*.jpg' -print0 |
2172        parallel --null --bar mogrify -resize 1200x1200 \
2173          -filter Lanczos {}
2174
2175 https://github.com/fd0/machma
2176 (Last checked: 2019-06)
2177
2178
2179 =head2 DIFFERENCES BETWEEN interlace AND GNU Parallel
2180
2181 Summary (see legend above):
2182
2183 =over
2184
2185 =item - I2 I3 I4 - - -
2186
2187 =item M1 - M3 - - M6
2188
2189 =item - O2 O3 - - - - x x
2190
2191 =item E1 E2 - - - - -
2192
2193 =item - - - - - - - - -
2194
2195 =item - -
2196
2197 =back
2198
2199 B<interlace> is built for network analysis to run network tools in parallel.
2200
2201 B<interlace> does not buffer output, so output from different jobs mixes.
2202
2203 The overhead for each target is O(n*n), so with 1000 targets it
2204 becomes very slow with an overhead in the order of 500ms/target.
2205
2206 =head3 EXAMPLES FROM interlace's WEBSITE
2207
2208 Using B<prips> most of the examples from
2209 https://github.com/codingo/Interlace can be run with GNU B<parallel>:
2210
2211 Blocker
2212
2213   commands.txt:
2214     mkdir -p _output_/_target_/scans/
2215     _blocker_
2216     nmap _target_ -oA _output_/_target_/scans/_target_-nmap
2217   interlace -tL ./targets.txt -cL commands.txt -o $output
2218
2219   parallel -a targets.txt \
2220     mkdir -p $output/{}/scans/\; nmap {} -oA $output/{}/scans/{}-nmap
2221
2222 Blocks
2223
2224   commands.txt:
2225     _block:nmap_
2226     mkdir -p _target_/output/scans/
2227     nmap _target_ -oN _target_/output/scans/_target_-nmap
2228     _block:nmap_
2229     nikto --host _target_
2230   interlace -tL ./targets.txt -cL commands.txt
2231
2232   _nmap() {
2233     mkdir -p $1/output/scans/
2234     nmap $1 -oN $1/output/scans/$1-nmap
2235   }
2236   export -f _nmap
2237   parallel ::: _nmap "nikto --host" :::: targets.txt
2238
2239 Run Nikto Over Multiple Sites
2240
2241   interlace -tL ./targets.txt -threads 5 \
2242     -c "nikto --host _target_ > ./_target_-nikto.txt" -v
2243
2244   parallel -a targets.txt -P5 nikto --host {} \> ./{}-nikto.txt
2245
2246 Run Nikto Over Multiple Sites and Ports
2247
2248   interlace -tL ./targets.txt -threads 5 -c \
2249     "nikto --host _target_:_port_ > ./_target_-_port_-nikto.txt" \
2250     -p 80,443 -v
2251
2252   parallel -P5 nikto --host {1}:{2} \> ./{1}-{2}-nikto.txt \
2253     :::: targets.txt ::: 80 443
2254
2255 Run a List of Commands against Target Hosts
2256
2257   commands.txt:
2258     nikto --host _target_:_port_ > _output_/_target_-nikto.txt
2259     sslscan _target_:_port_ >  _output_/_target_-sslscan.txt
2260     testssl.sh _target_:_port_ > _output_/_target_-testssl.txt
2261   interlace -t example.com -o ~/Engagements/example/ \
2262     -cL ./commands.txt -p 80,443
2263
2264   parallel --results ~/Engagements/example/{2}:{3}{1} {1} {2}:{3} \
2265     ::: "nikto --host" sslscan testssl.sh ::: example.com ::: 80 443
2266
2267 CIDR notation with an application that doesn't support it
2268
2269   interlace -t 192.168.12.0/24 -c "vhostscan _target_ \
2270     -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
2271
2272   prips 192.168.12.0/24 |
2273     parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2274
2275 Glob notation with an application that doesn't support it
2276
2277   interlace -t 192.168.12.* -c "vhostscan _target_ \
2278     -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
2279
2280   # Glob is not supported in prips
2281   prips 192.168.12.0/24 |
2282     parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2283
2284 Dash (-) notation with an application that doesn't support it
2285
2286   interlace -t 192.168.12.1-15 -c \
2287     "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
2288     -o ~/scans/ -threads 50
2289
2290   # Dash notation is not supported in prips
2291   prips 192.168.12.1 192.168.12.15 |
2292     parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2293
2294 Threading Support for an application that doesn't support it
2295
2296   interlace -tL ./target-list.txt -c \
2297     "vhostscan -t _target_ -oN _output_/_target_-vhosts.txt" \
2298     -o ~/scans/ -threads 50
2299
2300   cat ./target-list.txt |
2301     parallel -P50 vhostscan -t {} -oN ~/scans/{}-vhosts.txt
2302
2303 alternatively
2304
2305   ./vhosts-commands.txt:
2306     vhostscan -t $target -oN _output_/_target_-vhosts.txt
2307   interlace -cL ./vhosts-commands.txt -tL ./target-list.txt \
2308     -threads 50 -o ~/scans
2309
2310   ./vhosts-commands.txt:
2311     vhostscan -t "$1" -oN "$2"
2312   parallel -P50 ./vhosts-commands.txt {} ~/scans/{}-vhosts.txt \
2313     :::: ./target-list.txt
2314
2315 Exclusions
2316
2317   interlace -t 192.168.12.0/24 -e 192.168.12.0/26 -c \
2318     "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
2319     -o ~/scans/ -threads 50
2320
2321   prips 192.168.12.0/24 | grep -xv -Ff <(prips 192.168.12.0/26) |
2322     parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
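
The B<grep -xv -Ff> filter in the GNU B<parallel> version can be tried
without B<prips>. The sketch below uses two short, made-up address
lists in temporary files in place of the two B<prips> calls:

```shell
# Hypothetical address lists standing in for the two prips calls.
all=$(mktemp)
excluded=$(mktemp)
printf '192.168.12.%s\n' 1 2 10 63 64 > "$all"
printf '192.168.12.%s\n' 1 2 63 > "$excluded"

# -F: patterns are fixed strings (the dots are literal),
# -x: a pattern must match the whole line,
# -v: keep only the lines that are NOT in the exclusion list.
remaining=$(grep -xv -F -f "$excluded" "$all")
printf '%s\n' "$remaining"

rm -f "$all" "$excluded"
```

Without B<-x>, excluding 192.168.12.1 would also remove 192.168.12.10.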
2323
2324 Run Nikto Using Multiple Proxies
2325
2326    interlace -tL ./targets.txt -pL ./proxies.txt -threads 5 -c \
2327      "nikto --host _target_:_port_ -useproxy _proxy_ > \
2328       ./_target_-_port_-nikto.txt" -p 80,443 -v
2329
2330    parallel -j5 \
2331      "nikto --host {1}:{2} -useproxy {3} > ./{1}-{2}-nikto.txt" \
2332      :::: ./targets.txt ::: 80 443 :::: ./proxies.txt
2333
2334 https://github.com/codingo/Interlace
2335 (Last checked: 2019-09)
2336
2337
2338 =head2 DIFFERENCES BETWEEN otonvm Parallel AND GNU Parallel
2339
2340 I have been unable to get the code to run at all. It seems unfinished.
2341
2342 https://github.com/otonvm/Parallel
2343 (Last checked: 2019-02)
2344
2345
2346 =head2 DIFFERENCES BETWEEN k-bx par AND GNU Parallel
2347
2348 B<par> requires Haskell to work. This limits the number of platforms
2349 this can work on.
2350
2351 B<par> does line buffering in memory. The memory usage is 3x the
2352 longest line (compared to 1x for B<parallel --lb>). Commands must be
2353 given as arguments. There is no template.
2354
2355 These are the examples from https://github.com/k-bx/par with the
2356 corresponding GNU B<parallel> command.
2357
2358   par "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2359       "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2360   parallel --lb ::: "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2361       "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2362
2363   par "echo foo; sleep 1; foofoo" \
2364       "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2365   parallel --lb --halt 1 ::: "echo foo; sleep 1; foofoo" \
2366       "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2367
2368   par "PARPREFIX=[fooechoer] echo foo" "PARPREFIX=[bar] echo bar"
2369   parallel --lb --colsep , --tagstring {1} {2} \
2370     ::: "[fooechoer],echo foo" "[bar],echo bar"
2371
2372   par --succeed "foo" "bar" && echo 'wow'
2373   parallel ::: "foo" "bar"; true && echo 'wow'
2374
2375 https://github.com/k-bx/par
2376 (Last checked: 2019-02)
2377
2378 =head2 DIFFERENCES BETWEEN parallelshell AND GNU Parallel
2379
2380 B<parallelshell> does not allow for composed commands:
2381
2382   # This does not work
2383   parallelshell 'echo foo;echo bar' 'echo baz;echo quuz'
2384
2385 Instead you have to wrap that in a shell:
2386
2387   parallelshell 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2388
2389 It buffers output in RAM. All commands must be given on the command
2390 line and all commands are started in parallel at the same time. This
2391 will cause the system to freeze if there are so many jobs that there
2392 is not enough memory to run them all at the same time.
2393
2394 https://github.com/keithamus/parallelshell
2395 (Last checked: 2019-02)
2396
2397 https://github.com/darkguy2008/parallelshell
2398 (Last checked: 2019-03)
2399
2400
2401 =head2 DIFFERENCES BETWEEN shell-executor AND GNU Parallel
2402
2403 B<shell-executor> does not allow for composed commands:
2404
2405   # This does not work
2406   sx 'echo foo;echo bar' 'echo baz;echo quuz'
2407
2408 Instead you have to wrap that in a shell:
2409
2410   sx 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2411
2412 It buffers output in RAM. All commands must be given on the command
2413 line and all commands are started in parallel at the same time. This
2414 will cause the system to freeze if there are so many jobs that there
2415 is not enough memory to run them all at the same time.
2416
2417 https://github.com/royriojas/shell-executor
2418 (Last checked: 2019-02)
2419
2420
2421 =head2 DIFFERENCES BETWEEN non-GNU par AND GNU Parallel
2422
2423 B<par> buffers in memory to avoid mixing of jobs. It takes 1s per 1
2424 million output lines.
2425
2426 B<par> needs to have all commands before starting the first job. The
2427 jobs are read from stdin (standard input) so any quoting will have to
2428 be done by the user.
2429
2430 Stdout (standard output) is prepended with o:. Stderr (standard error)
2431 is sent to stdout (standard output) and prepended with e:.
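
This kind of tagging can be imitated in plain shell. The function
below is only a sketch of the idea and not how B<par> is implemented.
The name B<tag_output> is made up for this example, and the order of
the o: and e: lines is not guaranteed.

```shell
# tag_output CMD [ARGS...]: hypothetical helper that runs CMD and
# prefixes its stdout lines with "o:" and its stderr lines with "e:".
# Both streams end up on stdout.
tag_output() {
  {
    # fd 3 is the real stdout; stderr is sent down the outer pipe.
    { "$@" | sed 's/^/o:/'; } 2>&1 1>&3 | sed 's/^/e:/'
  } 3>&1
}

tag_output sh -c 'echo to-stdout; echo to-stderr >&2'
```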
2432
2433 For short jobs with little output B<par> is 20% faster than GNU
2434 B<parallel> and 60% slower than B<xargs>.
2435
2436 https://github.com/UnixJunkie/PAR
2437
2438 https://savannah.nongnu.org/projects/par
2439 (Last checked: 2019-02)
2440
2441
2442 =head2 DIFFERENCES BETWEEN fd AND GNU Parallel
2443
2444 B<fd> does not support composed commands, so commands must be wrapped
2445 in B<sh -c>.
2446
2447 It buffers output in RAM.
2448
2449 It only takes file names from the filesystem as input (similar to B<find>).
2450
2451 https://github.com/sharkdp/fd
2452 (Last checked: 2019-02)
2453
2454
2455 =head2 DIFFERENCES BETWEEN lateral AND GNU Parallel
2456
2457 B<lateral> is very similar to B<sem>: It takes a single command and
2458 runs it in the background. The design means that output from parallel
2459 running jobs may mix. If it dies unexpectedly it leaves a socket in
2460 ~/.lateral/socket.PID.
2461
2462 B<lateral> deals badly with too long command lines. This makes the
2463 B<lateral> server crash:
2464
2465   lateral run echo `seq 100000| head -c 1000k`
2466
2467 Any options will be read by B<lateral> so this does not work
2468 (B<lateral> interprets the B<-l>):
2469
2470   lateral run ls -l
2471
2472 Composed commands do not work:
2473
2474   lateral run pwd ';' ls
2475
2476 Functions do not work:
2477
2478   myfunc() { echo a; }
2479   export -f myfunc
2480   lateral run myfunc
2481
2482 Running B<emacs> in the terminal causes the parent shell to die:
2483
2484   echo '#!/bin/bash' > mycmd
2485   echo emacs -nw >> mycmd
2486   chmod +x mycmd
2487   lateral start
2488   lateral run ./mycmd
2489
2490 Here are the examples from https://github.com/akramer/lateral with the
2491 corresponding GNU B<sem> and GNU B<parallel> commands:
2492
2493   1$ lateral start
2494      for i in $(cat /tmp/names); do
2495        lateral run -- some_command $i
2496      done
2497      lateral wait
2498
2499   1$ for i in $(cat /tmp/names); do
2500        sem some_command $i
2501      done
2502      sem --wait
2503
2504   1$ parallel some_command :::: /tmp/names
2505
2506   2$ lateral start
2507      for i in $(seq 1 100); do
2508        lateral run -- my_slow_command < workfile$i > /tmp/logfile$i
2509      done
2510      lateral wait
2511
2512   2$ for i in $(seq 1 100); do
2513        sem my_slow_command < workfile$i > /tmp/logfile$i
2514      done
2515      sem --wait
2516
2517   2$ parallel 'my_slow_command < workfile{} > /tmp/logfile{}' \
2518        ::: {1..100}
2519
2520   3$ lateral start -p 0 # yup, it will just queue tasks
2521      for i in $(seq 1 100); do
2522        lateral run -- command_still_outputs_but_wont_spam inputfile$i
2523      done
2524      # command output spam can commence
2525      lateral config -p 10; lateral wait
2526
2527   3$ for i in $(seq 1 100); do
2528        echo "command inputfile$i" >> joblist
2529      done
2530      parallel -j 10 :::: joblist
2531
2532   3$ echo 1 > /tmp/njobs
2533      parallel -j /tmp/njobs command inputfile{} \
2534        ::: {1..100} &
2535      echo 10 >/tmp/njobs
2536      wait
2537
2538 https://github.com/akramer/lateral
2539 (Last checked: 2019-03)
2540
2541
2542 =head2 DIFFERENCES BETWEEN with-this AND GNU Parallel
2543
2544 The examples from https://github.com/amritb/with-this.git and the
2545 corresponding GNU B<parallel> command:
2546
2547   with -v "$(cat myurls.txt)" "curl -L this"
2548   parallel curl -L ::: myurls.txt
2549
2550   with -v "$(cat myregions.txt)" \
2551     "aws --region=this ec2 describe-instance-status"
2552   parallel aws --region={} ec2 describe-instance-status \
2553     :::: myregions.txt
2554
2555   with -v "$(ls)" "kubectl --kubeconfig=this get pods"
2556   ls | parallel kubectl --kubeconfig={} get pods
2557
2558   with -v "$(ls | grep config)" "kubectl --kubeconfig=this get pods"
2559   ls | grep config | parallel kubectl --kubeconfig={} get pods
2560
2561   with -v "$(echo {1..10})" "echo 123"
2562   parallel -N0 echo 123 ::: {1..10}
2563
2564 Stderr is merged with stdout. B<with-this> buffers in RAM. It uses 3x
2565 the output size, so you cannot have output larger than 1/3rd the
2566 amount of RAM. The input values cannot contain spaces. Composed
2567 commands do not work.
2568
2569 B<with-this> gives some additional information, so the output has to
2570 be cleaned before piping it to the next command.
2571
2572 https://github.com/amritb/with-this.git
2573 (Last checked: 2019-03)
2574
2575
2576 =head2 DIFFERENCES BETWEEN Tollef's parallel (moreutils) AND GNU Parallel
2577
2578 Summary (see legend above):
2579
2580 =over
2581
2582 =item - - - I4 - - I7
2583
2584 =item - - M3 - - M6
2585
2586 =item - O2 O3 - O5 O6 - x x
2587
2588 =item E1 - - - - - E7
2589
2590 =item - x x x x x x x x
2591
2592 =item - -
2593
2594 =back
2595
2596 =head3 EXAMPLES FROM Tollef's parallel MANUAL
2597
2598 B<Tollef> parallel sh -c "echo hi; sleep 2; echo bye" -- 1 2 3
2599
2600 B<GNU> parallel "echo hi; sleep 2; echo bye" ::: 1 2 3
2601
2602 B<Tollef> parallel -j 3 ufraw -o processed -- *.NEF
2603
2604 B<GNU> parallel -j 3 ufraw -o processed ::: *.NEF
2605
2606 B<Tollef> parallel -j 3 -- ls df "echo hi"
2607
2608 B<GNU> parallel -j 3 ::: ls df "echo hi"
2609
2610 (Last checked: 2019-08)
2611
2612 =head2 DIFFERENCES BETWEEN rargs AND GNU Parallel
2613
2614 Summary (see legend above):
2615
2616 =over
2617
2618 =item I1 - - - - - I7
2619
2620 =item - - M3 M4 - -
2621
2622 =item - O2 O3 - O5 O6 - O8 -
2623
2624 =item E1 - - E4 - - -
2625
2626 =item - - - - - - - - -
2627
2628 =item - -
2629
2630 =back
2631
2632 B<rargs> has elegant ways of doing named regexp capture and field ranges.
2633
2634 With GNU B<parallel> you can use B<--rpl> to get a similar
2635 functionality as regexp capture gives, and use B<join> and B<@arg> to
2636 get the field ranges. But the syntax is longer. This:
2637
2638   --rpl '{r(\d+)\.\.(\d+)} $_=join"$opt::colsep",@arg[$$1..$$2]'
2639
2640 would make it possible to use:
2641
2642   {1r3..6}
2643
2644 for field 3..6.
2645
2646 For full support of {n..m:s} including negative numbers use a dynamic
2647 replacement string like this:
2648
2649
2650   PARALLEL=--rpl\ \''{r((-?\d+)?)\.\.((-?\d+)?)((:([^}]*))?)}
2651           $a = defined $$2 ? $$2 < 0 ? 1+$#arg+$$2 : $$2 : 1;
2652           $b = defined $$4 ? $$4 < 0 ? 1+$#arg+$$4 : $$4 : $#arg+1;
2653           $s = defined $$6 ? $$7 : " ";
2654           $_ = join $s,@arg[$a..$b]'\'
2655   export PARALLEL
2656
2657 You can then do:
2658
2659   head /etc/passwd | parallel --colsep : echo ..={1r..} ..3={1r..3} \
2660     4..={1r4..} 2..4={1r2..4} 3..3={1r3..3} ..3:-={1r..3:-} \
2661     ..3:/={1r..3:/} -1={-1} -5={-5} -6={-6} -3..={1r-3..}
2662
2663 =head3 EXAMPLES FROM rargs MANUAL
2664
2665   1$ ls *.bak | rargs -p '(.*)\.bak' mv {0} {1}
2666
2667   1$ ls *.bak | parallel mv {} {.}
2668
2669   2$ cat download-list.csv |
2670        rargs -p '(?P<url>.*),(?P<filename>.*)' wget {url} -O {filename}
2671
2672   2$ cat download-list.csv |
2673        parallel --csv wget {1} -O {2}
2674   # or use regexps:
2675   2$ cat download-list.csv |
2676        parallel --rpl '{url} s/,.*//' --rpl '{filename} s/.*?,//' \
2677          wget {url} -O {filename}
2678
2679   3$ cat /etc/passwd |
2680        rargs -d: echo -e 'id: "{1}"\t name: "{5}"\t rest: "{6..::}"'
2681
2682   3$ cat /etc/passwd |
2683        parallel -q --colsep : \
2684          echo -e 'id: "{1}"\t name: "{5}"\t rest: "{=6 $_=join":",@arg[6..$#arg]=}"'
2685
2686 https://github.com/lotabout/rargs
2687 (Last checked: 2020-01)
2688
2689
2690 =head2 DIFFERENCES BETWEEN threader AND GNU Parallel
2691
2692 Summary (see legend above):
2693
2694 =over
2695
2696 =item I1 - - - - - -
2697
2698 =item M1 - M3 - - M6
2699
2700 =item O1 - O3 - O5 - - x x
2701
2702 =item E1 - - E4 - - -
2703
2704 =item - - - - - - - - -
2705
2706 =item - -
2707
2708 =back
2709
2710 Newline separates arguments, but newline at the end of file is treated
2711 as an empty argument. So this runs 2 jobs:
2712
2713   echo two_jobs | threader -run 'echo "$THREADID"'
2714
2715 B<threader> ignores stderr, so any output to stderr is
2716 lost. B<threader> buffers in RAM, so output bigger than the machine's
2717 virtual memory will cause the machine to crash.
2718
2719 https://github.com/voodooEntity/threader
2720 (Last checked: 2020-04)
2721
2722
2723 =head2 DIFFERENCES BETWEEN runp AND GNU Parallel
2724
2725 Summary (see legend above):
2726
2727 =over
2728
2729 =item I1 I2 - - - - -
2730
2731 =item M1 - (M3) - - M6
2732
2733 =item O1 O2 O3 - O5 O6 - x x -
2734
2735 =item E1 - - - - - -
2736
2737 =item - - - - - - - - -
2738
2739 =item - -
2740
2741 =back
2742
2743 (M3): B<runp> can only add a prefix and a postfix to the input, so the
2744 argument can be inserted on the command line only once.
2745
2746 B<runp> runs 10 jobs in parallel by default.  B<runp> blocks if output
2747 of a command is > 64 Kbytes.  Quoting of input is needed.  It adds
2748 output to stderr (this can be prevented with B<-q>).
2749
2750 =head3 Examples as GNU Parallel
2751
2752   base='https://images-api.nasa.gov/search'
2753   query='jupiter'
2754   desc='planet'
2755   type='image'
2756   url="$base?q=$query&description=$desc&media_type=$type"
2757
2758   # Download the images in parallel using runp
2759   curl -s $url | jq -r .collection.items[].href | \
2760     runp -p 'curl -s' | jq -r .[] | grep large | \
2761     runp -p 'curl -s -L -O'
2762
2763   time curl -s $url | jq -r .collection.items[].href | \
2764     runp -g 1 -q -p 'curl -s' | jq -r .[] | grep large | \
2765     runp -g 1 -q -p 'curl -s -L -O'
2766
2767   # Download the images in parallel
2768   curl -s $url | jq -r .collection.items[].href | \
2769     parallel curl -s | jq -r .[] | grep large | \
2770     parallel curl -s -L -O
2771
2772   time curl -s $url | jq -r .collection.items[].href | \
2773     parallel -j 1 curl -s | jq -r .[] | grep large | \
2774     parallel -j 1 curl -s -L -O
2775
2776
2777 =head4 Run some test commands (read from file)
2778
2779   # Create a file containing commands to run in parallel.
2780   cat << EOF > /tmp/test-commands.txt
2781   sleep 5
2782   sleep 3
2783   blah     # this will fail
2784   ls $PWD  # PWD shell variable is used here
2785   EOF
2786
2787   # Run commands from the file.
2788   runp /tmp/test-commands.txt > /dev/null
2789
2790   parallel -a /tmp/test-commands.txt > /dev/null
2791
2792 =head4 Ping several hosts and see packet loss (read from stdin)
2793
2794   # First copy this line and press Enter
2795   runp -p 'ping -c 5 -W 2' -s '| grep loss'
2796   localhost
2797   1.1.1.1
2798   8.8.8.8
2799   # Press Enter and Ctrl-D when done entering the hosts
2800
2801   # First copy this line and press Enter
2802   parallel ping -c 5 -W 2 {} '| grep loss'
2803   localhost
2804   1.1.1.1
2805   8.8.8.8
2806   # Press Enter and Ctrl-D when done entering the hosts
2807
2808 =head4 Get directories' sizes (read from stdin)
2809
2810   echo -e "$HOME\n/etc\n/tmp" | runp -q -p 'sudo du -sh'
2811
2812   echo -e "$HOME\n/etc\n/tmp" | parallel sudo du -sh
2813   # or:
2814   parallel sudo du -sh ::: "$HOME" /etc /tmp
2815
2816 =head4 Compress files
2817
2818   find . -iname '*.txt' | runp -p 'gzip --best'
2819
2820   find . -iname '*.txt' | parallel gzip --best
2821
2822 =head4 Measure HTTP request + response time
2823
2824   export CURL="curl -w 'time_total:  %{time_total}\n'"
2825   CURL="$CURL -o /dev/null -s https://golang.org/"
2826   perl -wE 'for (1..10) { say $ENV{CURL} }' |
2827      runp -q  # Make 10 requests
2828
2829   perl -wE 'for (1..10) { say $ENV{CURL} }' | parallel
2830   # or:
2831   parallel -N0 "$CURL" ::: {1..10}
2832
2833 =head4 Find open TCP ports
2834
2835   cat << EOF > /tmp/host-port.txt
2836   localhost 22
2837   localhost 80
2838   localhost 81
2839   127.0.0.1 443
2840   127.0.0.1 444
2841   scanme.nmap.org 22
2842   scanme.nmap.org 23
2843   scanme.nmap.org 443
2844   EOF
2845
2846   1$ cat /tmp/host-port.txt |
2847        runp -q -p 'netcat -v -w2 -z' 2>&1 | egrep '(succeeded!|open)$'
2848
2849   # --colsep is needed to split the line
2850   1$ cat /tmp/host-port.txt |
2851        parallel --colsep ' ' netcat -v -w2 -z 2>&1 |
2852        egrep '(succeeded!|open)$'
2853   # or use uq for unquoted:
2854   1$ cat /tmp/host-port.txt |
2855        parallel netcat -v -w2 -z {=uq=} 2>&1 |
2856        egrep '(succeeded!|open)$'
2857
2858 https://github.com/jreisinger/runp
2859 (Last checked: 2020-04)
2860
2861
2862 =head2 DIFFERENCES BETWEEN papply AND GNU Parallel
2863
2864 Summary (see legend above):
2865
2866 =over
2867
2868 =item - - - I4 - - -
2869
2870 =item M1 - M3 - - M6
2871
2872 =item - - O3 - O5 - - x x O10
2873
2874 =item E1 - - E4 - - -
2875
2876 =item - - - - - - - - -
2877
2878 =item - -
2879
2880 =back
2881
2882 B<papply> does not print the output if the command fails:
2883
2884   $ papply 'echo %F; false' foo
2885   "echo foo; false" did not succeed
2886
2887 B<papply>'s replacement strings (%F %d %f %n %e %z) can be simulated in GNU
2888 B<parallel> by putting this in B<~/.parallel/config>:
2889
2890   --rpl '%F'
2891   --rpl '%d $_=Q(::dirname($_));'
2892   --rpl '%f s:.*/::;'
2893   --rpl '%n s:.*/::;s:\.[^/.]+$::;'
2894   --rpl '%e s:.*\.:.:'
2895   --rpl '%z $_=""'
2896
2897 B<papply> buffers in RAM, and uses twice as much RAM as the size of
2898 the output. So output of 5 GB takes 10 GB RAM.
2899
2900 The buffering is very CPU intensive: Buffering a line of 5 GB takes 40
2901 seconds (compared to 10 seconds with GNU B<parallel>).
2902
2903
2904 =head3 Examples as GNU Parallel
2905
2906   1$ papply gzip *.txt
2907
2908   1$ parallel gzip ::: *.txt
2909
2910   2$ papply "convert %F %n.jpg" *.png
2911
2912   2$ parallel convert {} {.}.jpg ::: *.png
2913
2914
2915 https://pypi.org/project/papply/
2916 (Last checked: 2020-04)
2917
2918
2919 =head2 DIFFERENCES BETWEEN async AND GNU Parallel
2920
2921 Summary (see legend above):
2922
2923 =over
2924
2925 =item - - - I4 - - I7
2926
2927 =item - - - - - M6
2928
2929 =item - O2 O3 - O5 O6 - x x O10
2930
2931 =item E1 - - E4 - E6 -
2932
2933 =item - - - - - - - - -
2934
2935 =item S1 S2
2936
2937 =back
2938
2939 B<async> is very similar to GNU B<parallel>'s B<--semaphore> mode
2940 (aka B<sem>). B<async> requires the user to start a server process.
2941
2942 The input is quoted like B<-q> so you need B<bash -c "...;..."> to run
2943 composed commands.
2944
2945 =head3 Examples as GNU Parallel
2946
2947   1$ S="/tmp/example_socket"
2948
2949   1$ ID=myid
2950
2951   2$ async -s="$S" server --start
2952
2953   2$ # GNU Parallel does not need a server to run
2954
2955   3$ for i in {1..20}; do
2956          # prints command output to stdout
2957          async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
2958      done
2959
2960   3$ for i in {1..20}; do
2961          # prints command output to stdout
2962          sem --id "$ID" -j100% "sleep 1 && echo test $i"
2963          # GNU Parallel will only print job when it is done
2964          # If you need output from different jobs to mix
2965          # use -u or --line-buffer
2966          sem --id "$ID" -j100% --line-buffer "sleep 1 && echo test $i"
2967      done
2968
2969   4$ # wait until all commands are finished
2970      async -s="$S" wait
2971
2972   4$ sem --id "$ID" --wait
2973
2974   5$ # configure the server to run four commands in parallel
2975      async -s="$S" server -j4
2976
2977   5$ export PARALLEL=-j4
2978
2979   6$ mkdir "/tmp/ex_dir"
2980      for i in {21..40}; do
2981        # redirects command output to /tmp/ex_dir/file*
2982        async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
2983          bash -c "sleep 1 && echo test $i"
2984      done
2985
2986   6$ mkdir "/tmp/ex_dir"
2987      for i in {21..40}; do
2988        # redirects command output to /tmp/ex_dir/file*
2989        sem --id "$ID" --result '/tmp/ex_dir/file-{=$_=""=}'"$i" \
2990          "sleep 1 && echo test $i"
2991      done
2992
2993   7$ async -s="$S" wait
2994
2995   7$ sem --id "$ID" --wait
2996
2997   8$ # stops server
2998      async -s="$S" server --stop
2999
3000   8$ # GNU Parallel does not need to stop a server
3001
3002
3003 https://github.com/ctbur/async/
3004 (Last checked: 2023-01)
3005
3006
3007 =head2 DIFFERENCES BETWEEN pardi AND GNU Parallel
3008
3009 Summary (see legend above):
3010
3011 =over
3012
3013 =item I1 I2 - - - - I7
3014
3015 =item M1 - - - - M6
3016
3017 =item O1 O2 O3 O4 O5 - O7 - - O10
3018
3019 =item E1 - - E4 - - -
3020
3021 =item - - - - - - - - -
3022
3023 =item - -
3024
3025 =back
3026
3027 B<pardi> is very similar to B<parallel --pipe --cat>: It reads blocks
3028 of data and not arguments. So it cannot insert an argument in the
3029 command line. It puts the block into a temporary file, and this file
3030 name (%IN) can be put in the command line. You can only use %IN once.
3031
3032 It can also run full command lines in parallel (like: B<cat file |
3033 parallel>).
3034
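The block-to-file idea can be sketched in a few lines of shell. This
is not how B<pardi> or GNU B<parallel> is implemented: B<per_record>
is a hypothetical helper, it assumes that a record starts at every
line matching a regexp, and it runs the command on the temporary
files one after another instead of in parallel:

```shell
#!/usr/bin/env bash
# Sketch: split stdin into records that start at lines matching a
# regexp, store each record in its own temporary file, and run a
# command on every file (sequentially here).
# Usage: per_record REGEXP COMMAND [ARG...]   (file name is appended)
per_record() {
  local re=$1; shift
  local dir f
  dir=$(mktemp -d) || return
  awk -v re="$re" -v dir="$dir" '
    # Start a new output file at every record start
    $0 ~ re || NR == 1 {
      if (out) close(out)
      out = sprintf("%s/rec%06d", dir, ++n)
    }
    { print > out }
  '
  for f in "$dir"/rec*; do
    [ -e "$f" ] && "$@" "$f"
  done
  rm -rf "$dir"
}

# Print the first line of each record
printf '%s\n' '#atoms 2' 'H' 'O' '#atoms 1' 'C' |
  per_record '^#atoms' head -n 1
```

The sample input holds two records, so B<head -n 1> is run twice and
prints the two B<#atoms> lines.
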
3035 =head3 EXAMPLES FROM pardi test.sh
3036
3037   1$ time pardi -v -c 100 -i data/decoys.smi -ie .smi -oe .smi \
3038        -o data/decoys_std_pardi.smi \
3039           -w '(standardiser -i %IN -o %OUT 2>&1) > /dev/null'
3040
3041   1$ cat data/decoys.smi |
3042        time parallel -N 100 --pipe --cat \
3043          '(standardiser -i {} -o {#} 2>&1) > /dev/null; cat {#}; rm {#}' \
3044          > data/decoys_std_pardi.smi
3045
3046   2$ pardi -n 1 -i data/test_in.types -o data/test_out.types \
3047              -d 'r:^#atoms:' -w 'cat %IN > %OUT'
3048
3049   2$ cat data/test_in.types |
3050        parallel -n 1 -k --pipe --cat --regexp --recstart '^#atoms' \
3051          'cat {}' > data/test_out.types
3052
3053   3$ pardi -c 6 -i data/test_in.types -o data/test_out.types \
3054              -d 'r:^#atoms:' -w 'cat %IN > %OUT'
3055
3056   3$ cat data/test_in.types |
3057        parallel -n 6 -k --pipe --cat --regexp --recstart '^#atoms' \
3058          'cat {}' > data/test_out.types
3059
3060   4$ pardi -i data/decoys.mol2 -o data/still_decoys.mol2 \
3061              -d 's:@<TRIPOS>MOLECULE' -w 'cp %IN %OUT'
3062
3063   4$ cat data/decoys.mol2 |
3064        parallel -n 1 --pipe --cat --recstart '@<TRIPOS>MOLECULE' \
3065          'cp {} {#}; cat {#}; rm {#}' > data/still_decoys.mol2
3066
3067   5$ pardi -i data/decoys.mol2 -o data/decoys2.mol2 \
3068              -d b:10000 -w 'cp %IN %OUT' --preserve
3069
3070   5$ cat data/decoys.mol2 |
3071        parallel -k --pipe --block 10k --recend '' --cat \
3072          'cat {} > {#}; cat {#}; rm {#}' > data/decoys2.mol2
3073
3074 https://github.com/UnixJunkie/pardi
3075 (Last checked: 2021-01)
3076
3077
3078 =head2 DIFFERENCES BETWEEN bthread AND GNU Parallel
3079
3080 Summary (see legend above):
3081
3082 =over
3083
3084 =item - - - I4 - - -
3085
3086 =item - - - - - M6
3087
3088 =item O1 - O3 - - - O7 O8 - -
3089
3090 =item E1 - - - - - -
3091
3092 =item - - - - - - - - -
3093
3094 =item - -
3095
3096 =back
3097
3098 B<bthread> takes around 1 sec per MB of output. The maximal output
3099 line length is 1073741759.
3100
3101 You cannot quote spaces in the command, so you cannot run composed
3102 commands like B<sh -c "echo a; echo b">.
3103
3104 https://gitlab.com/netikras/bthread
3105 (Last checked: 2021-01)
3106
3107
3108 =head2 DIFFERENCES BETWEEN simple_gpu_scheduler AND GNU Parallel
3109
3110 Summary (see legend above):
3111
3112 =over
3113
3114 =item I1 - - - - - I7
3115
3116 =item M1 - - - - M6
3117
3118 =item - O2 O3 - - O6 - x x O10
3119
3120 =item E1 - - - - - -
3121
3122 =item - - - - - - - - -
3123
3124 =item - -
3125
3126 =back
3127
3128 =head3 EXAMPLES FROM simple_gpu_scheduler MANUAL
3129
3130   1$ simple_gpu_scheduler --gpus 0 1 2 < gpu_commands.txt
3131
3132   1$ parallel -j3 --shuf \
3133      CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' \
3134        < gpu_commands.txt
3135
3136   2$ simple_hypersearch \
3137        "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
3138        -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
3139        simple_gpu_scheduler --gpus 0,1,2
3140
3141   2$ parallel --header : --shuf -j3 -v \
3142        CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =}' \
3143        python3 train_dnn.py --lr {lr} --batch_size {bs} \
3144        ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
3145
3146   3$ simple_hypersearch \
3147        "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
3148        --n-samples 5 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
3149        simple_gpu_scheduler --gpus 0,1,2
3150
3151   3$ parallel --header : --shuf \
3152        CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1; seq()>5 and skip() =}' \
3153        python3 train_dnn.py --lr {lr} --batch_size {bs} \
3154        ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
3155
3156   4$ touch gpu.queue
3157      tail -f -n 0 gpu.queue | simple_gpu_scheduler --gpus 0,1,2 &
3158      echo "my_command_with | and stuff > logfile" >> gpu.queue
3159
3160   4$ touch gpu.queue
3161      tail -f -n 0 gpu.queue |
3162        parallel -j3 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' &
3163      # Needed to fill job slots once
3164      seq 3 | parallel echo true >> gpu.queue
3165      # Add jobs
3166      echo "my_command_with | and stuff > logfile" >> gpu.queue
3167      # Needed to flush output from completed jobs
3168      seq 3 | parallel echo true >> gpu.queue
3169
3170 https://github.com/ExpectationMax/simple_gpu_scheduler
3171 (Last checked: 2021-01)
3172
3173
3174 =head2 DIFFERENCES BETWEEN parasweep AND GNU Parallel
3175
3176 B<parasweep> is a Python module for facilitating parallel parameter
3177 sweeps.
3178
3179 A B<parasweep> job will normally take a text file as input. The text
3180 file contains arguments for the job. Some of these arguments will be
3181 fixed and some of them will be changed by B<parasweep>.
3182
3183 It does this by having a template file such as template.txt:
3184
3185   Xval: {x}
3186   Yval: {y}
3187   FixedValue: 9
3188   # x with 2 decimals
3189   DecimalX: {x:.2f}
3190   TenX: ${x*10}
3191   RandomVal: {r}
3192
3193 and from this template it generates the file to be used by the job by
3194 replacing the replacement strings.
3195
3196 Being a Python module, B<parasweep> integrates more tightly with
3197 Python than GNU B<parallel>. You get the parameters directly in a
3198 Python data structure. With GNU B<parallel> you can use the JSON or
3199 CSV output format to get something similar, but you would have to
3200 read the output.
3201
3202 B<parasweep> has a filtering method to ignore parameter combinations
3203 you do not need.
3204
3205 Instead of calling the jobs directly, B<parasweep> can use Python's
3206 Distributed Resource Management Application API to make jobs run with
3207 different cluster software.
3208
3209
3210 GNU B<parallel>'s B<--tmpl> supports templates with replacement
3211 strings, such as:
3212
3213   Xval: {x}
3214   Yval: {y}
3215   FixedValue: 9
3216   # x with 2 decimals
3217   DecimalX: {=x $_=sprintf("%.2f",$_) =}
3218   TenX: {=x $_=$_*10 =}
3219   RandomVal: {=1 $_=rand() =}
3220
3221 that can be used like:
3222
3223   parallel --header : --tmpl my.tmpl={#}.t myprog {#}.t \
3224     ::: x 1 2 3 ::: y 1 2 3
3225
3226 Filtering is supported as:
3227
3228   parallel --filter '{1} > {2}' echo ::: 1 2 3 ::: 1 2 3
3229
3230 https://github.com/eviatarbach/parasweep
3231 (Last checked: 2021-01)
3232
3233
3234 =head2 DIFFERENCES BETWEEN parallel-bash AND GNU Parallel
3235
3236 Summary (see legend above):
3237
3238 =over
3239
3240 =item I1 I2 - - - - -
3241
3242 =item - - M3 - - M6
3243
3244 =item - O2 O3 - O5 O6 - O8 x O10
3245
3246 =item E1 - - - - - -
3247
3248 =item - - - - - - - - -
3249
3250 =item - -
3251
3252 =back
3253
3254 B<parallel-bash> is written in pure bash. It is really fast (overhead
3255 of ~0.05 ms/job compared to GNU B<parallel>'s 3-10 ms/job). So if your
3256 jobs are extremely short lived, and you can live with the quite
3257 limited command, this may be useful.
3258
3259 It works by making a queue for each process. Then the jobs are
3260 distributed to the queues in a round robin fashion. Finally the queues
3261 are started in parallel. This works fine, if you are lucky, but if
3262 not, all the long jobs may end up in the same queue, so you may see:
3263
3264   $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
3265       time parallel -P4 sleep {}
3266   (7 seconds)
3267   $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
3268       time ./parallel-bash.bash -p 4 -c sleep {}
3269   (12 seconds)
3270
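The difference between the two timings can be reproduced without
running any jobs. The following is a minimal sketch in bash, not code
from either tool: the function names B<round_robin_makespan> and
B<dynamic_makespan> are made up for this illustration, job run times
are assumed to be whole seconds, and the dynamic model simply assumes
that the next job is given to the slot that becomes free first:

```shell
#!/usr/bin/env bash
# Sketch only: simulate the two scheduling strategies.
# No jobs are actually run.

# Usage: round_robin_makespan SLOTS TIME...
round_robin_makespan() {
  local slots=$1; shift
  local -a queue=()
  local i=0 q t max=0
  for t in "$@"; do
    # Job number i always goes to queue i modulo SLOTS
    q=$(( i % slots ))
    queue[q]=$(( ${queue[q]:-0} + t ))
    i=$(( i + 1 ))
  done
  for t in "${queue[@]}"; do
    if (( t > max )); then max=$t; fi
  done
  echo "$max"
}

# Usage: dynamic_makespan SLOTS TIME...
dynamic_makespan() {
  local slots=$1; shift
  local -a free_at=()
  local s t best max=0
  for (( s = 0; s < slots; s++ )); do free_at[s]=0; done
  for t in "$@"; do
    # Give the job to the slot that becomes free first
    best=0
    for (( s = 1; s < slots; s++ )); do
      if (( free_at[s] < free_at[best] )); then best=$s; fi
    done
    free_at[best]=$(( free_at[best] + t ))
  done
  for t in "${free_at[@]}"; do
    if (( t > max )); then max=$t; fi
  done
  echo "$max"
}

runtimes=(1 1 1 4 1 1 1 4 1 1 1 4)
echo "round robin: $(round_robin_makespan 4 "${runtimes[@]}") seconds"
echo "dynamic:     $(dynamic_makespan 4 "${runtimes[@]}") seconds"
```

With the job list from the example above the sketch reports 12 seconds
for the round robin split (every fourth job is a long one, so all long
jobs land in the same queue) and 7 seconds for the dynamic assignment.
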
3271 Because it uses bash lists, the total number of jobs is limited to
3272 between 167000 and 265000, depending on your environment. You get a
3273 segmentation fault when you reach the limit.
3274
3275 Ctrl-C does not stop spawning new jobs. Ctrl-Z does not suspend
3276 running jobs.
3277
3278
3279 =head3 EXAMPLES FROM parallel-bash
3280
3281   1$ some_input | parallel-bash -p 5 -c echo
3282
3283   1$ some_input | parallel -j 5 echo
3284
3285   2$ parallel-bash -p 5 -c echo < some_file
3286
3287   2$ parallel -j 5 echo < some_file
3288
3289   3$ parallel-bash -p 5 -c echo <<< 'some string'
3290
3291   3$ parallel -j 5 echo <<< 'some string'
3292
3293   4$ something | parallel-bash -p 5 -c echo {} {}
3294
3295   4$ something | parallel -j 5 echo {} {}
3296
3297 https://reposhub.com/python/command-line-tools/Akianonymus-parallel-bash.html
3298 (Last checked: 2021-06)
3299
3300
3301 =head2 DIFFERENCES BETWEEN bash-concurrent AND GNU Parallel
3302
3303 B<bash-concurrent> is more an alternative to B<make> than to GNU
3304 B<parallel>. Its input is very similar to a Makefile, where jobs
3305 depend on other jobs.
3306
3307 It has a nice progress indicator where you can see which jobs
3308 completed successfully, which jobs are currently running, which jobs
3309 failed, and which jobs were skipped because a job they depend on
3310 failed. The indicator does not deal well with resizing the window.
3311
3312 Output is cached in tempfiles on disk, but is only shown if there is
3313 an error, so it is not meant to be part of a UNIX pipeline. If
3314 B<bash-concurrent> crashes these tempfiles are not removed.
3315
3316 It uses an O(n*n) algorithm, so if you have 1000 independent jobs it
3317 takes 22 seconds to start it.
3318
3319 https://github.com/themattrix/bash-concurrent
3320 (Last checked: 2021-02)
3321
3322
3323 =head2 DIFFERENCES BETWEEN spawntool AND GNU Parallel
3324
3325 Summary (see legend above):
3326
3327 =over
3328
3329 =item I1 - - - - - -
3330
3331 =item M1 - - - - M6
3332
3333 =item - O2 O3 - O5 O6 - x x O10
3334
3335 =item E1 - - - - - -
3336
3337 =item - - - - - - - - -
3338
3339 =item - -
3340
3341 =back
3342
3343 B<spawn> reads full command lines from stdin and executes them in
3344 parallel.
3345
3346
3347 http://code.google.com/p/spawntool/
3348 (Last checked: 2021-07)
3349
3350
3351 =head2 DIFFERENCES BETWEEN go-pssh AND GNU Parallel
3352
3353 Summary (see legend above):
3354
3355 =over
3356
3357 =item - - - - - - -
3358
3359 =item M1 - - - - -
3360
3361 =item O1 - - - - - - x x O10
3362
3363 =item E1 - - - - - -
3364
3365 =item R1 R2 - - - R6 - - -
3366
3367 =item - -
3368
3369 =back
3370
3371 B<go-pssh> does B<ssh> in parallel to multiple machines. It runs the
3372 same command on multiple machines similar to B<--nonall>.
3373
3374 Hosts must be given as IP addresses; hostnames are not accepted.
3375
3376 Output is sent to stdout (standard output) if command is successful,
3377 and to stderr (standard error) if the command fails.
3378
3379 =head3 EXAMPLES FROM go-pssh
3380
3381   1$ go-pssh -l <ip>,<ip> -u <user> -p <port> -P <passwd> -c "<command>"
3382
3383   1$ parallel -S 'sshpass -p <passwd> ssh -p <port> <user>@<ip>' \
3384        --nonall "<command>"
3385
3386   2$ go-pssh scp -f host.txt -u <user> -p <port> -P <password> \
3387        -s /local/file_or_directory -d /remote/directory
3388
3389   2$ parallel --nonall --slf host.txt \
3390        --basefile /local/file_or_directory/./ --wd /remote/directory \
3391        --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true
3392
3393   3$ go-pssh scp -l <ip>,<ip> -u <user> -p <port> -P <password> \
3394        -s /local/file_or_directory -d /remote/directory
3395
3396   3$ parallel --nonall -S <ip>,<ip> \
3397        --basefile /local/file_or_directory/./ --wd /remote/directory \
3398        --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true
3399
3400 https://github.com/xuchenCN/go-pssh
3401 (Last checked: 2021-07)
3402
3403
3404 =head2 DIFFERENCES BETWEEN go-parallel AND GNU Parallel
3405
3406 Summary (see legend above):
3407
3408 =over
3409
3410 =item I1 I2 - - - - I7
3411
3412 =item - - M3 - - M6
3413
3414 =item - O2 O3 - O5 - - x x - O10
3415
3416 =item E1 - - E4 - - -
3417
3418 =item - - - - - - - - -
3419
3420 =item - -
3421
3422 =back
3423
3424 B<go-parallel> uses Go templates for replacement strings, quite
3425 similar to the I<{= perl expr =}> replacement string.
3426
3427 =head3 EXAMPLES FROM go-parallel
3428
3429   1$ go-parallel -a ./files.txt -t 'cp {{.Input}} {{.Input | dirname | dirname}}'
3430
3431   1$ parallel -a ./files.txt cp {} '{= $_=::dirname(::dirname($_)) =}'
3432
3433   2$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{noExt .Input}}'
3434
3435   2$ parallel -a ./files.txt echo mkdir -p {} {.}
3436
3437   3$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{.Input | basename | noExt}}'
3438
3439   3$ parallel -a ./files.txt echo mkdir -p {} {/.}
3440
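For readers unfamiliar with either notation, the path manipulations
used in these examples can be written with plain shell parameter
expansion. This is only an illustrative sketch: the helper names
B<no_ext>, B<base_name> and B<dir_name> are hypothetical, and simple
paths are assumed (no trailing slash, and an extension in the last
path component):

```shell
#!/usr/bin/env bash
# Hypothetical helpers that mimic the path manipulations in the
# examples above for simple paths.

# Like {.} / noExt: remove the last extension
no_ext()    { printf '%s\n' "${1%.*}"; }
# Like {/} / basename: remove the directory part
base_name() { printf '%s\n' "${1##*/}"; }
# Like {//} / dirname: keep only the directory part
dir_name()  { printf '%s\n' "${1%/*}"; }

f=dir/sub/file.tar.gz
no_ext "$f"                  # dir/sub/file.tar
base_name "$f"               # file.tar.gz
no_ext "$(base_name "$f")"   # file.tar  (like {/.})
dir_name "$(dir_name "$f")"  # dir       (like dirname twice)
```
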
3441 https://github.com/mylanconnolly/parallel
3442 (Last checked: 2021-07)
3443
3444
3445 =head2 DIFFERENCES BETWEEN p AND GNU Parallel
3446
3447 Summary (see legend above):
3448
3449 =over
3450
3451 =item - - - I4 - - x
3452
3453 =item - - - - - M6
3454
3455 =item - O2 O3 - O5 O6 - x x - O10
3456
3457 =item E1 - - - - - -
3458
3459 =item - - - - - - - - -
3460
3461 =item - -
3462
3463 =back
3464
3465 B<p> is a tiny shell script. It can color output with some predefined
3466 colors, but is otherwise quite limited.
3467
3468 It maxes out at around 116000 jobs (probably due to limitations in Bash).
3469
3470 =head3 EXAMPLES FROM p
3471
3472 Some of the examples from B<p> cannot be implemented 100% by GNU
3473 B<parallel>: The coloring is a bit different, and GNU B<parallel>
3474 cannot have B<--tag> for some inputs and not for others.
3477
3478   1$ p -bc blue "ping 127.0.0.1" -uc red "ping 192.168.0.1" \
3479      -rc yellow "ping 192.168.1.1" -t example "ping example.com"
3480
3481   1$ parallel --lb -j0 --color --tag ping \
3482      ::: 127.0.0.1 192.168.0.1 192.168.1.1 example.com
3483
3484   2$ p "tail -f /var/log/httpd/access_log" \
3485      -bc red "tail -f /var/log/httpd/error_log"
3486
3487   2$ cd /var/log/httpd;
3488      parallel --lb --color --tag tail -f ::: access_log error_log
3489
3490   3$ p tail -f "some file" \& p tail -f "other file with space.txt"
3491
3492   3$ parallel --lb tail -f ::: 'some file' "other file with space.txt"
3493
3494   4$ p -t project1 "hg pull project1" -t project2 \
3495      "hg pull project2" -t project3 "hg pull project3"
3496
3497   4$ parallel --lb hg pull ::: project{1..3}
3498
3499 https://github.com/rudymatela/evenmoreutils/blob/master/man/p.1.adoc
3500 (Last checked: 2022-04)
3501
3502
3503 =head2 DIFFERENCES BETWEEN seneschal AND GNU Parallel
3504
3505 Summary (see legend above):
3506
3507 =over
3508
3509 =item I1 - - - - - -
3510
3511 =item M1 - M3 - - M6
3512
3513 =item O1 - O3 O4 - - - x x -
3514
3515 =item E1 - - - - - -
3516
3517 =item - - - - - - - - -
3518
3519 =item - -
3520
3521 =back
3522
3523 B<seneschal> only starts the first job after reading the last job, and
3524 output from the first job is only printed after the last job finishes.
3525
3526 1 byte of output requires 3.5 bytes of RAM.
3527
3528 This makes it impossible to have a total output bigger than the
3529 virtual memory.
3530
3531 Even though output is kept in RAM, outputting is quite slow: 30 MB/s.
3532
3533 Output larger than 4 GB causes random problems - it looks like a race
3534 condition.
3535
3536 This:
3537
3538   echo 1 | seneschal  --prefix='yes `seq 1000`|head -c 1G' >/dev/null
3539
3540 takes 4100(!) CPU seconds to run on a 64C64T server, but only 140 CPU
3541 seconds on a 4C8T laptop. So it looks like B<seneschal> wastes a lot
3542 of CPU time coordinating the CPUs.
3543
3544 Compare this to:
3545
3546   echo 1 | time -v parallel -N0 'yes `seq 1000`|head -c 1G' >/dev/null
3547
3548 which takes 3-8 CPU seconds.
3549
3550 =head3 EXAMPLES FROM seneschal README.md
3551
3552   1$ echo $REPOS | seneschal --prefix="cd {} && git pull"
3553
3554   # If $REPOS is newline separated
3555   1$ echo "$REPOS" | parallel -k "cd {} && git pull"
3556   # If $REPOS is space separated
3557   1$ echo -n "$REPOS" | parallel -d' ' -k "cd {} && git pull"
3558
3559   COMMANDS="pwd
3560   sleep 5 && echo boom
3561   echo Howdy
3562   whoami"
3563
3564   2$ echo "$COMMANDS" | seneschal --debug
3565
3566   2$ echo "$COMMANDS" | parallel -k -v
3567
3568   3$ ls -1 | seneschal --prefix="pushd {}; git pull; popd;"
3569
3570   3$ ls -1 | parallel -k "pushd {}; git pull; popd;"
3571   # Or if current dir also contains files:
3572   3$ parallel -k "pushd {}; git pull; popd;" ::: */
3573
3574 https://github.com/TheWizardTower/seneschal
3575 (Last checked: 2022-06)
3576
3577
3578 =head2 DIFFERENCES BETWEEN async AND GNU Parallel
3579
3580 Summary (see legend above):
3581
3582 =over
3583
3584 =item x x x x x x x
3585
3586 =item - x x x x x
3587
3588 =item x O2 O3 O4 O5 O6 - x x O10
3589
3590 =item E1 - - E4 - - -
3591
3592 =item - - - - - - - - -
3593
3594 =item S1 S2
3595
3596 =back
3597
3598 B<async> works like B<sem>.
3599
3600
3601 =head3 EXAMPLES FROM async
3602
3603   1$ S="/tmp/example_socket"
3604
3605      async -s="$S" server --start
3606
3607      for i in {1..20}; do
3608          # prints command output to stdout
3609          async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
3610      done
3611
3612      # wait until all commands are finished
3613      async -s="$S" wait
3614
3615   1$ S="example_id"
3616
3617      # server not needed
3618
3619      for i in {1..20}; do
3620          # prints command output to stdout
3621          sem --bg --id "$S" -j100% "sleep 1 && echo test $i"
3622      done
3623
3624      # wait until all commands are finished
3625      sem --fg --id "$S" --wait
3626
3627   2$ # configure the server to run four commands in parallel
3628      async -s="$S" server -j4
3629
3630      mkdir "/tmp/ex_dir"
3631      for i in {21..40}; do
3632          # redirects command output to /tmp/ex_dir/file*
3633          async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
3634            bash -c "sleep 1 && echo test $i"
3635      done
3636
3637      async -s="$S" wait
3638
3639      # stops server
3640      async -s="$S" server --stop
3641
3642   2$ # starting server not needed
3643
3644      mkdir "/tmp/ex_dir"
3645      for i in {21..40}; do
3646          # redirects command output to /tmp/ex_dir/file*
3647          sem --bg --id "$S" --results "/tmp/ex_dir/file$i{}" \
3648            "sleep 1 && echo test $i"
3649      done
3650
3651      sem --fg --id "$S" --wait
3652
3653      # there is no server to stop
3654
3655 https://github.com/ctbur/async
3656 (Last checked: 2023-01)
3657
3658
3659 =head2 DIFFERENCES BETWEEN tandem AND GNU Parallel
3660
3661 Summary (see legend above):
3662
3663 =over
3664
3665 =item - - - I4 - - x
3666
3667 =item M1 - - - - M6
3668
3669 =item - - O3 - - - - x - -
3670
3671 =item E1 - E3 - E5 - -
3672
3673 =item - - - - - - - - -
3674
3675 =item - -
3676
3677 =back
3678
3679 B<tandem> runs full commands in parallel. It is made for starting a
3680 "server", running a job against the server, and when the job is done,
3681 the server is killed.
3682
3683 More generally: it kills all jobs when the first job completes,
3684 similar to B<--halt now,done=1>.
3685
3686 B<tandem> silently discards some output. It is unclear exactly when
3687 this happens. It looks like a race condition, because it varies for
3688 each run.
3689
3690   $ tandem "seq 10000" | wc -l
3691   6731 <- This should always be 10002
3692
3693
3694 =head3 EXAMPLES FROM Demo
3695
3696   tandem \
3697     'php -S localhost:8000' \
3698     'esbuild src/*.ts --bundle --outdir=dist --watch' \
3699     'tailwind -i src/index.css -o dist/index.css --watch'
3700
3701   # Emulate tandem's behaviour
3702   PARALLEL='--color --lb  --halt now,done=1 --tagstring '
3703   PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
3704   export PARALLEL
3705
3706   parallel ::: \
3707     'php -S localhost:8000' \
3708     'esbuild src/*.ts --bundle --outdir=dist --watch' \
3709     'tailwind -i src/index.css -o dist/index.css --watch'
3710
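Read on its own, the Perl expression in B<--tagstring> keeps the first
word of the command and appends a dot and a counter for that word. The
same logic can be sketched in bash (version 4 or later for associative
arrays). B<make_tag> is a hypothetical helper, the third command in
the list is added only to show the counter, and the sketch assumes the
expression is evaluated once per job:

```shell
#!/usr/bin/env bash
# Sketch (bash 4+): build tags the way the Perl expression
#   s/ .*//; $_.=".".$app{$_}++;
# reads when evaluated once per command.
declare -A seen=()

# Hypothetical helper: sets $tag to "<first word>.<count so far>"
make_tag() {
  local app=${1%% *}             # drop everything from the first space
  tag="$app.${seen[$app]:-0}"    # append the per-command counter
  seen[$app]=$(( ${seen[$app]:-0} + 1 ))
}

for cmd in 'php -S localhost:8000' \
           'esbuild src/*.ts --bundle --outdir=dist --watch' \
           'php -S localhost:8001'; do
  make_tag "$cmd"
  printf '%s\t%s\n' "$tag" "$cmd"
done
```

The three commands get the tags php.0, esbuild.0 and php.1, so two
instances of the same program can be told apart in the output.
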
3711
3712 =head3 EXAMPLES FROM tandem -h
3713
3714   # Emulate tandem's behaviour
3715   PARALLEL='--color --lb  --halt now,done=1 --tagstring '
3716   PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
3717   export PARALLEL
3718
3719   1$ tandem 'sleep 5 && echo "hello"' 'sleep 2 && echo "world"'
3720
3721   1$ parallel ::: 'sleep 5 && echo "hello"' 'sleep 2 && echo "world"'
3722
3723   # '-t 0' fails, but '--timeout 0' works
3724   2$ tandem --timeout 0 'sleep 5 && echo "hello"' \
3725        'sleep 2 && echo "world"'
3726
3727   2$ parallel --timeout 0 ::: 'sleep 5 && echo "hello"' \
3728        'sleep 2 && echo "world"'
3729
3730 =head3 EXAMPLES FROM tandem's readme.md
3731
3732   # Emulate tandem's behaviour
3733   PARALLEL='--color --lb  --halt now,done=1 --tagstring '
3734   PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
3735   export PARALLEL
3736
3737   1$ tandem 'next dev' 'nodemon --quiet ./server.js'
3738
3739   1$ parallel ::: 'next dev' 'nodemon --quiet ./server.js'
3740
3741   2$ cat package.json
3742      {
3743        "scripts": {
3744          "dev:php": "...",
3745          "dev:js": "...",
3746          "dev:css": "..."
3747        }
3748      }
3749
3750      tandem 'npm:dev:php' 'npm:dev:js' 'npm:dev:css'
3751
3752   # GNU Parallel uses bash functions instead
3753   2$ cat package.sh
3754      dev:php() { ... ; }
3755      dev:js() { ... ; }
3756      dev:css() { ... ; }
3757      export -f dev:php dev:js dev:css
3758
3759      . package.sh
3760      parallel ::: dev:php dev:js dev:css
3761
3762   3$ tandem 'npm:dev:*'
3763
3764   3$ compgen -A function | grep ^dev: | parallel
3765
3766 For usage in Makefiles, include a copy of GNU B<parallel> with your
3767 source using B<parallel --embed>. This has the added benefit of also
3768 working if access to the internet is down or restricted.
3769
3770 https://github.com/rosszurowski/tandem
3771 (Last checked: 2023-01)
3772
3773
3774 =head2 DIFFERENCES BETWEEN rust-parallel(aaronriekenberg) AND GNU Parallel
3775
3776 Summary (see legend above):
3777
3778 =over
3779
3780 =item I1 I2 I3 - - - -
3781
3782 =item - - - - - M6
3783
3784 =item O1 O2 O3 - O5 O6 - x - O10
3785
3786 =item E1 - - E4 - - -
3787
3788 =item - - - - - - - - -
3789
3790 =item - -
3791
3792 =back
3793
3794 B<rust-parallel> has a goal of only using Rust. It seems it is
3795 impossible to call bash functions from the command line. You would
3796 need to put these in a script.
3797
3798 Calling a script that lacks the shebang line (#! as first line)
3799 fails.
3800
3801 =head3 EXAMPLES FROM rust-parallel's README.md
3802
3803   $ cat >./test <<EOL
3804   echo hi
3805   echo there
3806   echo how
3807   echo are
3808   echo you
3809   EOL
3810
3811   1$ cat test | rust-parallel -j5
3812
3813   1$ cat test | parallel -j5
3814
3815   2$ cat test | rust-parallel -j1
3816
3817   2$ cat test | parallel -j1
3818
3819   3$ head -100 /usr/share/dict/words | rust-parallel md5 -s
3820
3821   3$ head -100 /usr/share/dict/words | parallel md5 -s
3822
3823   4$ find . -type f -print0 | rust-parallel -0 gzip -f -k
3824
3825   4$ find . -type f -print0 | parallel -0 gzip -f -k
3826
3827   5$ head -100 /usr/share/dict/words |
3828        awk '{printf "md5 -s %s\n", $1}' | rust-parallel
3829
3830   5$ head -100 /usr/share/dict/words |
3831        awk '{printf "md5 -s %s\n", $1}' | parallel
3832
3833   6$ head -100 /usr/share/dict/words | rust-parallel md5 -s |
3834        grep -i abba
3835
3836   6$ head -100 /usr/share/dict/words | parallel md5 -s |
3837        grep -i abba
3838
3839 https://github.com/aaronriekenberg/rust-parallel
3840 (Last checked: 2023-01)
3841
3842
3843 =head2 DIFFERENCES BETWEEN parallelium AND GNU Parallel
3844
3845 Summary (see legend above):
3846
3847 =over
3848
3849 =item - I2 - - - - -
3850
3851 =item M1 - - - - M6
3852
3853 =item O1 - O3 - - - - x - -
3854
3855 =item E1 - - E4 - - -
3856
3857 =item - - - - - - - - -
3858
3859 =item - -
3860
3861 =back
3862
3863 B<parallelium> merges standard output (stdout) and standard error
3864 (stderr). The maximal output of a command is 8192 bytes. Bigger output
3865 makes B<parallelium> go into an infinite loop.
3866
3867 In the input file for B<parallelium> you can define a tag, so that you
3868 can select to run only these commands. A bit like a target in a
3869 Makefile.
3870
3871 Progress is printed on standard output (stdout) prepended with '#'
3872 with similar information as GNU B<parallel>'s B<--bar>.
3873
3874 =head3 EXAMPLES
3875
3876     $ cat testjobs.txt
3877     #tag common sleeps classA
3878     (sleep 4.495;echo "job 000")
3879     :
3880     (sleep 2.587;echo "job 016")
3881
3882     #tag common sleeps classB
3883     (sleep 0.218;echo "job 017")
3884     :
3885     (sleep 2.269;echo "job 040")
3886
3887     #tag common sleeps classC
3888     (sleep 2.586;echo "job 041")
3889     :
3890     (sleep 1.626;echo "job 099")
3891
3892     #tag lasthalf, sleeps, classB
3893     (sleep 1.540;echo "job 100")
3894     :
3895     (sleep 2.001;echo "job 199")
3896
3897     1$ parallelium -f testjobs.txt -l logdir -t classB,classC
3898
3899     1$ cat testjobs.txt |
3900          parallel --plus --results logdir/testjobs.txt_{0#}.output \
3901            '{= if(/^#tag /) { @tag = split/,|\s+/ }
3902                (grep /^(classB|classC)$/, @tag) or skip =}'
3903
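The tag selection itself does not depend on either tool. As a rough
sketch, assuming the file format shown above (a B<#tag> line applies
to the commands that follow it, tags are separated by commas or
spaces), it could be written in awk. B<select_by_tag> is a
hypothetical helper that only prints the selected commands; the output
can then be piped to a shell or to GNU B<parallel>:

```shell
#!/usr/bin/env bash
# Sketch: print only the commands whose most recent "#tag" line
# contains one of the wanted tags (comma separated in $1).
select_by_tag() {
  awk -v want="$1" '
    BEGIN { n = split(want, w, ",") }
    /^#tag / {
      # Remember whether this tag line names a wanted tag
      keep = 0
      m = split($0, tag, /[, \t]+/)
      for (i = 2; i <= m; i++)
        for (j = 1; j <= n; j++)
          if (tag[i] == w[j]) keep = 1
      next
    }
    NF && keep    # non-empty command lines in a selected group
  '
}

select_by_tag classB,classC <<'EOF'
#tag common sleeps classA
echo "job 000"

#tag common sleeps classB
echo "job 017"

#tag lasthalf, sleeps, classB
echo "job 100"
EOF
```

For this shortened job file only the commands for job 017 and job 100
are printed, because job 000 is tagged classA.
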
3904 https://github.com/beomagi/parallelium
3905 (Last checked: 2023-01)
3906
3907
3908 =head2 DIFFERENCES BETWEEN forkrun AND GNU Parallel
3909
3910 Summary (see legend above):
3911
3912 =over
3913
3914 =item I1 - - - - - I7
3915
3916 =item - - - - - -
3917
3918 =item - O2 O3 - O5 - - - - O10
3919
3920 =item E1 - - E4 - - -
3921
3922 =item - - - - - - - - -
3923
3924 =item - -
3925
3926 =back
3927
3928
3929 B<forkrun> blocks if it receives fewer jobs than slots:
3930
3931   echo | forkrun -p 2 echo
3932
3933 or when it gets some specific commands e.g.:
3934
3935   f() { seq "$@" | pv -qL 3; }
3936   seq 10 | forkrun f
3937
3938 It is not clear why.
3939
3940 It is faster than GNU B<parallel> (overhead: 1.2 ms/job vs 3 ms/job),
3941 but way slower than B<parallel-bash> (0.059 ms/job).
3942
3943 Running jobs cannot be stopped by pressing CTRL-C.
3944
3945 B<-k> is supposed to keep the order but fails on the MIX testing
3946 example below. If used with B<-k> it caches output in RAM.
3947
3948 If B<forkrun> is killed, it leaves temporary files in
3949 B</tmp/.forkrun.*> that have to be cleaned up manually.
3950
3951 =head3 EXAMPLES
3952
3953   1$ time find ./ -type f |
3954        forkrun -l512 -- sha256sum 2>/dev/null | wc -l
3955   1$ time find ./ -type f |
3956        parallel -j28 -m -- sha256sum 2>/dev/null | wc -l
3957
3958   2$ time find ./ -type f |
3959        forkrun -l512 -k -- sha256sum 2>/dev/null | wc -l
3960   2$ time find ./ -type f |
3961        parallel -j28 -k -m -- sha256sum 2>/dev/null | wc -l
3962
3963 https://github.com/jkool702/forkrun
3964 (Last checked: 2023-02)
3965
3966
3967 =head2 DIFFERENCES BETWEEN parallel-sh AND GNU Parallel
3968
3969 Summary (see legend above):
3970
3971 =over
3972
3973 =item I1 I2 - I4 - - -
3974
3975 =item M1 - - - - M6
3976
3977 =item O1 O2 O3 - O5 O6 - - - O10
3978
3979 =item E1 - - E4 - - -
3980
3981 =item - - - - - - - - -
3982
3983 =item - -
3984
3985 =back
3986
3987 B<parallel-sh> buffers in RAM. Buffering the data takes O(n^1.5) time:
3988
3989 2MB=0.107s 4MB=0.175s 8MB=0.342s 16MB=0.766s 32MB=2.2s 64MB=6.7s
3990 128MB=20s 256MB=64s 512MB=248s 1024MB=998s 2048MB=3756s
3991
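Since each measurement doubles the output size, the growth can be
checked directly from these numbers: if the time is proportional to
size^k, then k = log2(t(2n)/t(n)). The following awk sketch computes
that estimate for each doubling; B<growth_exponents> is a name made up
for this illustration and the input is just the timings listed above:

```shell
#!/usr/bin/env bash
# Sketch: estimate the exponent k in time ~ size^k from timings
# measured at doubling sizes: k = log2(t(2n) / t(n)).
# Input lines: SIZE_IN_MB SECONDS
growth_exponents() {
  awk '
    NR > 1 {
      printf "%s -> %s MB: k = %.2f\n", size, $1, log($2 / prev) / log(2)
    }
    { size = $1; prev = $2 }
  '
}

growth_exponents <<'EOF'
2 0.107
4 0.175
8 0.342
16 0.766
32 2.2
64 6.7
128 20
256 64
512 248
1024 998
2048 3756
EOF
```

For these timings the estimate grows with the size: it is below 1 for
the smallest outputs, roughly 1.5 to 1.7 between 16 MB and 256 MB, and
close to 2 above that.
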
3992 This limits the practical usability to jobs outputting < 256 MB. GNU
3993 B<parallel> buffers on disk, yet is faster for jobs with outputs > 16
3994 MB and is only limited by the free space in $TMPDIR.
3995
3996 B<parallel-sh> can kill running jobs if a job fails (Similar to
3997 B<--halt now,fail=1>).
3998
3999 =head3 EXAMPLES
4000
4001   1$ parallel-sh "sleep 2 && echo first" "sleep 1 && echo second"
4002
4003   1$ parallel ::: "sleep 2 && echo first" "sleep 1 && echo second"
4004
4005   2$ cat /tmp/commands
4006      sleep 2 && echo first
4007      sleep 1 && echo second
4008
4009   2$ parallel-sh -f /tmp/commands
4010
4011   2$ parallel -a /tmp/commands
4012
4013   3$ echo -e 'sleep 2 && echo first\nsleep 1 && echo second' |
4014        parallel-sh
4015
4016   3$ echo -e 'sleep 2 && echo first\nsleep 1 && echo second' |
4017        parallel
4018
4019 https://github.com/thyrc/parallel-sh
4020 (Last checked: 2023-04)
4021
4022
4023 =head2 DIFFERENCES BETWEEN bash-parallel AND GNU Parallel
4024
4025 Summary (see legend above):
4026
4027 =over
4028
4029 =item - I2 - - - - I7
4030
4031 =item M1 - M3 - M5 M6
4032
4033 =item - O2 O3 - - O6 - O8 - O10
4034
4035 =item E1 - - - - - -
4036
4037 =item - - - - - - - - -
4038
4039 =item - -
4040
4041 =back
4042
4043 B<bash-parallel> is not as much a command as it is a shell script that
4044 you have to alter. It requires you to change the shell function
4045 process_job that runs the job, and set $MAX_POOL_SIZE to the number of
4046 jobs to run in parallel.
4047
4048 It is half as fast as GNU B<parallel> for short jobs.
4049
4050 https://github.com/thilinaba/bash-parallel
4051 (Last checked: 2023-05)
4052
4053
4054 =head2 DIFFERENCES BETWEEN PaSH AND GNU Parallel
4055
4056 Summary (see legend above): N/A
4057
4058 B<pash> is quite different from GNU B<parallel>. It is not a general
4059 parallelizer. It takes a shell script and analyses it and parallelizes
4060 parts of it by replacing the parts with commands that will give the same
4061 result.
4062
4063 This will replace B<sort> with a command that does pretty much the
4064 same as B<parsort --parallel=8> (except somewhat slower):
4065
4066   pa.sh --width 8 -c 'cat bigfile | sort'
4067
4068 However, even a simple change will confuse B<pash> and you will get no
4069 parallelization:
4070
4071   pa.sh --width 8 -c 'mysort() { sort; }; cat bigfile | mysort'
4072   pa.sh --width 8 -c 'cat bigfile | sort | md5sum'
4073
4074 From the source it seems B<pash> only looks at: awk cat col comm cut
4075 diff grep head mkfifo mv rm sed seq sort tail tee tr uniq wc xargs
4076
4077 For pipelines where these commands are bottlenecks, it might be worth
4078 testing if B<pash> is faster than GNU B<parallel>.
4079
4080 B<pash> does not respect $TMPDIR but always uses /tmp. If B<pash> dies
4081 unexpectedly, it does not clean up.
4082
4083 https://github.com/binpash/pash
4084 (Last checked: 2023-05)
4085
4086
4087 =head2 DIFFERENCES BETWEEN korovkin-parallel AND GNU Parallel
4088
4089 Summary (see legend above):
4090
4091 =over
4092
4093 =item I1 - - - - - -
4094
4095 =item M1 - - - - M6
4096
4097 =item - - O3 - - - - x x -
4098
4099 =item E1 - - - - - -
4100
4101 =item R1 - - - - R6 x x -
4102
4103 =item - -
4104
4105 =back
4106
4107 B<korovkin-parallel> prepends all lines with some info.
4108
4109 The output is colored with 6 color combinations, so job 1 and 7 will
4110 get the same color.
4111
4112 You can get similar output with:
4113
4114   (echo ...) |
4115     parallel --color -j 10 --lb --tagstring \
4116       '[l:{#}:{=$_=sprintf("%7.03f",::now()-$^T)=} {=$_=hh_mm_ss($^T)=} {%}]'
4117
4118 Lines longer than 8192 chars are broken into lines shorter than
4119 8192. B<korovkin-parallel> loses the last char for lines exactly 8193
4120 chars long.
4121
4122 Short lines from different jobs do not mix, but long lines do:
4123
4124   fun() {
4125     perl -e '$a="'$1'"x1000000; for(1..'$2') { print $a };';
4126     echo;
4127   }
4128   export -f fun
4129   (echo fun a 100;echo fun b 100) | korovkin-parallel | tr -s abcdef
4130   # Compare to:
4131   (echo fun a 100;echo fun b 100) | parallel | tr -s abcdef
4132
4133 There should be only one line of a's and one line of b's.
4134
Just like GNU B<parallel>, B<korovkin-parallel> offers a master/slave
4136 model, so workers on other servers can do some of the tasks. But
4137 contrary to GNU B<parallel> you must manually start workers on these
4138 servers. The communication is neither authenticated nor encrypted.
4139
It caches output in RAM: a 1GB line uses ~2.5GB RAM.
4141
4142 https://github.com/korovkin/parallel
4143 (Last checked: 2023-07)
4144
4145
4146 =head2 DIFFERENCES BETWEEN xe AND GNU Parallel
4147
4148 Summary (see legend above):
4149
4150 =over
4151
4152 =item I1 I2 - I4 - - I7
4153
4154 =item M1 - M3 M4 - M6
4155
4156 =item - O2 O3 - O5 O6 - O8 - O10
4157
4158 =item E1 - - E4 - - -
4159
4160 =item - - - - - - - - -
4161
4162 =item - -
4163
4164 =back
4165
4166 B<xe> has a peculiar limitation:
4167
4168   echo /bin/echo | xe {} OK
4169   echo echo | xe /bin/{} fails
4170
4171
4172 =head3 EXAMPLES
4173
4174 Compress all .c files in the current directory, using all CPU cores:
4175
4176   1$ xe -a -j0 gzip -- *.c
4177
4178   1$ parallel gzip ::: *.c
4179
4180 Remove all empty files, using lr(1):
4181
4182   2$ lr -U -t 'size == 0' | xe -N0 rm
4183
4184   2$ lr -U -t 'size == 0' | parallel -X rm
4185
4186 Convert .mp3 to .ogg, using all CPU cores:
4187
4188   3$ xe -a -j0 -s 'ffmpeg -i "${1}" "${1%.mp3}.ogg"' -- *.mp3
4189
4190   3$ parallel ffmpeg -i {} {.}.ogg ::: *.mp3
4191
4192 Same, using percent rules:
4193
4194   4$ xe -a -j0 -p %.mp3 ffmpeg -i %.mp3 %.ogg -- *.mp3
4195
4196   4$ parallel --rpl '% s/\.mp3// or skip' ffmpeg -i %.mp3 %.ogg ::: *.mp3
4197
4198 Similar, but hiding output of ffmpeg, instead showing spawned jobs:
4199
4200   5$ xe -ap -j0 -vvq '%.{m4a,ogg,opus}' ffmpeg -y -i {} out/%.mp3 -- *
4201
4202   5$ parallel -v --rpl '% s/\.(m4a|ogg|opus)// or skip' \
4203        ffmpeg -y -i {} out/%.mp3 '2>/dev/null' ::: *
4204
4205   5$ parallel -v ffmpeg -y -i {} out/{.}.mp3 '2>/dev/null' ::: *
4206
4207 https://github.com/leahneukirchen/xe
4208 (Last checked: 2023-08)
4209
4210
4211 =head2 DIFFERENCES BETWEEN sp AND GNU Parallel
4212
4213 Summary (see legend above):
4214
4215 =over
4216
4217 =item - - - I4 - - -
4218
4219 =item M1 - M3 - - M6
4220
4221 =item - O2 O3 - O5 (O6) - x x O10
4222
4223 =item E1 - - - - - -
4224
4225 =item - - - - - - - - -
4226
4227 =item - -
4228
4229 =back
4230
4231 B<sp> has very few options.
4232
4233 It can either be used like:
4234
4235   sp command {} option :: arg1 arg2 arg3
4236
4237 which is similar to:
4238
4239   parallel command {} option ::: arg1 arg2 arg3
4240
4241 Or:
4242
4243   sp command1 :: "command2 -option" :: "command3 foo bar"
4244
4245 which is similar to:
4246
4247   parallel ::: command1 "command2 -option" "command3 foo bar"
4248
B<sp> deals badly with too many commands: it runs out of file handles,
which causes data loss.
4251
4252 For each command that fails, B<sp> will print an error message on
4253 stderr (standard error).
4254
You cannot use exported shell functions as commands.
4256
4257 =head3 EXAMPLES
4258
4259   1$ sp echo {} :: 1 2 3
4260
4261   1$ parallel echo {} ::: 1 2 3
4262
4263   2$ sp echo {} {} :: 1 2 3
4264
  2$ parallel echo {} {} ::: 1 2 3
4266
4267   3$ sp echo 1 :: echo 2 :: echo 3
4268
4269   3$ parallel ::: 'echo 1' 'echo 2' 'echo 3'
4270
4271   4$ sp a foo bar :: "b 'baz  bar'" :: c
4272
  4$ parallel ::: 'a foo bar' "b 'baz  bar'" c
4274
4275 https://github.com/SergioBenitez/sp
4276 (Last checked: 2023-10)
4277
4278
4279 =head2 Todo
4280
4281 https://github.com/justanhduc/task-spooler
4282
4283 https://manpages.ubuntu.com/manpages/xenial/man1/tsp.1.html
4284
4285 https://www.npmjs.com/package/concurrently
4286
4287 http://code.google.com/p/push/ (cannot compile)
4288
4289 https://github.com/krashanoff/parallel
4290
4291 https://github.com/Nukesor/pueue
4292
4293 https://arxiv.org/pdf/2012.15443.pdf KumQuat
4294
4295 https://github.com/JeiKeiLim/simple_distribute_job
4296
4297 https://github.com/reggi/pkgrun - not obvious how to use
4298
4299 https://github.com/benoror/better-npm-run - not obvious how to use
4300
4301 https://github.com/bahmutov/with-package
4302
4303 https://github.com/flesler/parallel
4304
4305 https://github.com/Julian/Verge
4306
4307 https://vicerveza.homeunix.net/~viric/soft/ts/
4308
4309 https://github.com/chapmanjacobd/que
4310
4311
4312
4313 =head1 TESTING OTHER TOOLS
4314
There are certain issues that are very common in parallelizing
tools. Here are a few stress tests. Be warned: If the tool is badly
coded it may overload your machine.
4318
4319
4320 =head2 MIX: Output mixes
4321
4322 Output from 2 jobs should not mix. If the output is not used, this
4323 does not matter; but if the output I<is> used then it is important
4324 that you do not get half a line from one job followed by half a line
4325 from another job.
4326
4327 If the tool does not buffer, output will most likely mix now and then.
4328
4329 This test stresses whether output mixes.
4330
4331   #!/bin/bash
4332
4333   paralleltool="parallel -j 30"
4334
4335   cat <<-EOF > mycommand
4336   #!/bin/bash
4337
4338   # If a, b, c, d, e, and f mix: Very bad
4339   perl -e 'print STDOUT "a"x3000_000," "'
4340   perl -e 'print STDERR "b"x3000_000," "'
4341   perl -e 'print STDOUT "c"x3000_000," "'
4342   perl -e 'print STDERR "d"x3000_000," "'
4343   perl -e 'print STDOUT "e"x3000_000," "'
4344   perl -e 'print STDERR "f"x3000_000," "'
4345   echo
4346   echo >&2
4347   EOF
4348   chmod +x mycommand
4349
4350   # Run 30 jobs in parallel
4351   seq 30 |
4352     $paralleltool ./mycommand > >(tr -s abcdef) 2> >(tr -s abcdef >&2)
4353
4354   # 'a c e' and 'b d f' should always stay together
4355   # and there should only be a single line per job
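
The visual check can be automated. This is a minimal sketch, assuming
the stdout of the test above has been squeezed with B<tr -s abcdef>;
the function name count_bad_lines is made up for this example.

```shell
#!/bin/bash

# Count input lines that are not exactly the expected squeezed line.
# $1 is the expected line: 'a c e ' for stdout, 'b d f ' for stderr
# (note the trailing space that is printed before the newline).
count_bad_lines() {
  # -F fixed string, -x whole line, -v invert, -c count.
  # grep exits non-zero when the count is 0, so ignore its status.
  grep -F -x -v -c -- "$1" || true
}

# Two intact lines and one where output from two jobs has mixed
printf 'a c e \na c a e \na c e \n' | count_bad_lines 'a c e '
```

The example prints 1. For a tool that neither mixes nor decorates
output, piping the squeezed stdout of the 30 jobs into
B<count_bad_lines 'a c e '> should print 0.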
4356
4357
4358 =head2 STDERRMERGE: Stderr is merged with stdout
4359
4360 Output from stdout and stderr should not be merged, but kept separated.
4361
4362 This test shows whether stdout is mixed with stderr.
4363
4364   #!/bin/bash
4365
4366   paralleltool="parallel -j0"
4367
4368   cat <<-EOF > mycommand
4369   #!/bin/bash
4370
4371   echo stdout
4372   echo stderr >&2
4373   echo stdout
4374   echo stderr >&2
4375   EOF
4376   chmod +x mycommand
4377
4378   # Run one job
4379   echo |
4380     $paralleltool ./mycommand > stdout 2> stderr
4381   cat stdout
4382   cat stderr
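
Instead of reading the two files by eye, the verdict can be computed.
This is a sketch under assumptions: check_streams and its messages are
made up for this example, and a shell function stands in for the
./mycommand script so the block is self-contained.

```shell
#!/bin/bash

# Stand-in for the ./mycommand script of the test above
mycommand() {
  echo stdout
  echo stderr >&2
  echo stdout
  echo stderr >&2
}

# Run "$@" (the command line of the tool under test) and report
# whether any line ended up on the wrong stream
check_streams() {
  local out err
  out=$(mktemp) || return 1
  err=$(mktemp) || return 1
  "$@" >"$out" 2>"$err"
  if grep -q stderr "$out" || grep -q stdout "$err"; then
    echo "streams merged or swapped"
  else
    echo "streams kept separate"
  fi
  rm -f "$out" "$err"
}

check_streams mycommand
```

This prints "streams kept separate". To test a real tool, something
like B<echo | check_streams $paralleltool ./mycommand> should work,
since stdin is passed through to the tool.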
4383
4384
4385 =head2 RAM: Output limited by RAM
4386
Some tools cache output in RAM. This makes them extremely slow if the
output is bigger than physical memory, and makes them crash if the
output is bigger than the virtual memory.
4390
4391   #!/bin/bash
4392
4393   paralleltool="parallel -j0"
4394
4395   cat <<'EOF' > mycommand
4396   #!/bin/bash
4397
4398   # Generate 1 GB output
4399   yes "`perl -e 'print \"c\"x30_000'`" | head -c 1G
4400   EOF
4401   chmod +x mycommand
4402
4403   # Run 20 jobs in parallel
4404   # Adjust 20 to be > physical RAM and < free space on /tmp
4405   seq 20 | time $paralleltool ./mycommand | wc -c
4406
4407
4408 =head2 DISKFULL: Incomplete data if /tmp runs full
4409
If caching is done on disk, the disk can run full during the run. Not
all programs discover this. GNU Parallel discovers it if the disk
stays full for at least 2 seconds.
4413
4414   #!/bin/bash
4415
4416   paralleltool="parallel -j0"
4417
4418   # This should be a dir with less than 100 GB free space
4419   smalldisk=/tmp/shm/parallel
4420
4421   TMPDIR="$smalldisk"
4422   export TMPDIR
4423
4424   max_output() {
4425       # Force worst case scenario:
4426       # Make GNU Parallel only check once per second
4427       sleep 10
4428       # Generate 100 GB to fill $TMPDIR
4429       # Adjust if /tmp is bigger than 100 GB
4430       yes | head -c 100G >$TMPDIR/$$
4431       # Generate 10 MB output that will not be buffered
4432       # due to full disk
4433       perl -e 'print "X"x10_000_000' | head -c 10M
4434       echo This part is missing from incomplete output
4435       sleep 2
4436       rm $TMPDIR/$$
4437       echo Final output
4438   }
4439
4440   export -f max_output
4441   seq 10 | $paralleltool max_output | tr -s X
4442
4443
4444 =head2 CLEANUP: Leaving tmp files at unexpected death
4445
Some tools do not clean up tmp files if they are killed. This matters
in particular for tools that buffer output on disk.
4448
4449   #!/bin/bash
4450
4451   paralleltool=parallel
4452
4453   ls /tmp >/tmp/before
4454   seq 10 | $paralleltool sleep &
4455   pid=$!
4456   # Give the tool time to start up
4457   sleep 1
  # Kill it without giving it a chance to clean up
  kill -9 $pid
4460   # Should be empty: No files should be left behind
4461   diff <(ls /tmp) /tmp/before
4462
4463
4464 =head2 SPCCHAR: Dealing badly with special file names.
4465
4466 It is not uncommon for users to create files like:
4467
4468   My brother's 12" *** record  (costs $$$).jpg
4469
4470 Some tools break on this.
4471
4472   #!/bin/bash
4473
4474   paralleltool=parallel
4475
4476   touch "My brother's 12\" *** record  (costs \$\$\$).jpg"
4477   ls My*jpg | $paralleltool ls -l
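
For comparison, a tool that accepts NUL-separated input can handle
such a name without any quoting. This is a sketch using B<xargs -0>
(supported by both GNU and BSD xargs); the function name list_special
and the scratch directory are assumptions made for this example.

```shell
#!/bin/bash

# Create the problematic file in a scratch dir and list it via xargs -0.
# The function body is a subshell, so the cd does not leak out.
list_special() (
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  touch "My brother's 12\" *** record  (costs \$\$\$).jpg"
  # NUL-terminated names: no character in the name needs quoting
  printf '%s\0' My*jpg | xargs -0 ls
  cd / && rm -r "$dir"
)

list_special
```

When stdout is not a terminal, B<ls> prints the name unchanged, so the
output is exactly the file name shown above.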
4478
4479
4480 =head2 COMPOSED: Composed commands do not work
4481
4482 Some tools require you to wrap composed commands into B<bash -c>.
4483
4484   echo bar | $paralleltool echo foo';' echo {}
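
B<xargs> is one example of a tool that needs the wrapper, because it
runs a single program. A minimal sketch of the B<bash -c> workaround
(the function name composed_with_xargs is made up for this example):

```shell
#!/bin/bash

# xargs runs one program per job, so the composed command is wrapped
# in bash -c. The argument is passed as $1 instead of being pasted
# into the script text, which avoids quoting problems.
composed_with_xargs() {
  xargs -n 1 bash -c 'echo foo; echo "$1"' _
}

echo bar | composed_with_xargs
```

This prints foo and bar on separate lines, which is also what a tool
that supports composed commands directly should print for the test
above.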
4485
4486
4487 =head2 ONEREP: Only one replacement string allowed
4488
4489 Some tools can only insert the argument once.
4490
4491   echo bar | $paralleltool echo {} foo {}
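
A tool that passes this test prints "bar foo bar". As a reference
point, B<xargs -I> replaces every occurrence of the replacement
string; the function name insert_twice is made up for this example.

```shell
#!/bin/bash

# xargs -I{} substitutes the input line for each {} in the command
insert_twice() {
  xargs -I{} echo {} foo {}
}

echo bar | insert_twice
```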
4492
4493
4494 =head2 INPUTSIZE: Length of input should not be limited
4495
4496 Some tools limit the length of the input lines artificially with no good
4497 reason. GNU B<parallel> does not:
4498
4499   perl -e 'print "foo."."x"x100_000_000' | parallel echo {.}
4500
GNU B<parallel> limits the command to run to 128 KB due to execve(2):
4502
4503   perl -e 'print "x"x131_000' | parallel echo {} | wc
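
The limit on the command comes from the kernel and can be probed
without any parallelizing tool. On Linux with 4 KB pages a single
argument is limited to 128 KB (MAX_ARG_STRLEN is 32 pages); other
systems have other limits. The sketch below assumes /bin/sh exists;
the function name arg_fits is made up for this example.

```shell
#!/bin/bash

# Succeed if an external command can be started with one argument
# of $1 characters
arg_fits() {
  local s
  printf -v s '%*s' "$1" ''      # build a string of $1 spaces
  # ':' does nothing; only the execve() of /bin/sh is of interest
  /bin/sh -c : sh "$s" 2>/dev/null
}

for n in 1000 100000 10000000; do
  if arg_fits "$n"; then
    echo "$n chars: OK"
  else
    echo "$n chars: too long"
  fi
done
```

The exact size at which the result flips depends on the operating
system and its configuration.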
4504
4505
4506 =head2 NUMWORDS: Speed depends on number of words
4507
4508 Some tools become very slow if output lines have many words.
4509
4510   #!/bin/bash
4511
4512   paralleltool=parallel
4513
4514   cat <<-EOF > mycommand
4515   #!/bin/bash
4516
4517   # 10 MB of lines with 1000 words
4518   yes "`seq 1000`" | head -c 10M
4519   EOF
4520   chmod +x mycommand
4521
4522   # Run 30 jobs in parallel
4523   seq 30 | time $paralleltool -j0 ./mycommand > /dev/null
4524
4525 =head2 4GB: Output with a line > 4GB should be OK
4526
4527   #!/bin/bash
4528
4529   paralleltool="parallel -j0"
4530
4531   cat <<-EOF > mycommand
4532   #!/bin/bash
4533
4534   perl -e '\$a="a"x1000_000; for(1..5000) { print \$a }'
4535   EOF
4536   chmod +x mycommand
4537
4538   # Run 1 job
4539   seq 1 | $paralleltool ./mycommand | LC_ALL=C wc
4540
4541
4542 =head1 AUTHOR
4543
4544 When using GNU B<parallel> for a publication please cite:
4545
4546 O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
4547 The USENIX Magazine, February 2011:42-47.
4548
4549 This helps funding further development; and it won't cost you a cent.
4550 If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
4551
4552 Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk
4553
4554 Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk
4555
4556 Copyright (C) 2010-2023 Ole Tange, http://ole.tange.dk and Free
4557 Software Foundation, Inc.
4558
Parts of the manual concerning B<xargs> compatibility are inspired by
4560 the manual of B<xargs> from GNU findutils 4.4.2.
4561
4562
4563 =head1 LICENSE
4564
4565 This program is free software; you can redistribute it and/or modify
4566 it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
4569
4570 This program is distributed in the hope that it will be useful,
4571 but WITHOUT ANY WARRANTY; without even the implied warranty of
4572 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
4573 GNU General Public License for more details.
4574
4575 You should have received a copy of the GNU General Public License
4576 along with this program.  If not, see <https://www.gnu.org/licenses/>.
4577
4578 =head2 Documentation license I
4579
4580 Permission is granted to copy, distribute and/or modify this
4581 documentation under the terms of the GNU Free Documentation License,
4582 Version 1.3 or any later version published by the Free Software
4583 Foundation; with no Invariant Sections, with no Front-Cover Texts, and
4584 with no Back-Cover Texts.  A copy of the license is included in the
4585 file LICENSES/GFDL-1.3-or-later.txt.
4586
4587 =head2 Documentation license II
4588
4589 You are free:
4590
4591 =over 9
4592
4593 =item B<to Share>
4594
4595 to copy, distribute and transmit the work
4596
4597 =item B<to Remix>
4598
4599 to adapt the work
4600
4601 =back
4602
4603 Under the following conditions:
4604
4605 =over 9
4606
4607 =item B<Attribution>
4608
4609 You must attribute the work in the manner specified by the author or
4610 licensor (but not in any way that suggests that they endorse you or
4611 your use of the work).
4612
4613 =item B<Share Alike>
4614
4615 If you alter, transform, or build upon this work, you may distribute
4616 the resulting work only under the same, similar or a compatible
4617 license.
4618
4619 =back
4620
4621 With the understanding that:
4622
4623 =over 9
4624
4625 =item B<Waiver>
4626
4627 Any of the above conditions can be waived if you get permission from
4628 the copyright holder.
4629
4630 =item B<Public Domain>
4631
4632 Where the work or any of its elements is in the public domain under
4633 applicable law, that status is in no way affected by the license.
4634
4635 =item B<Other Rights>
4636
4637 In no way are any of the following rights affected by the license:
4638
4639 =over 2
4640
4641 =item *
4642
4643 Your fair dealing or fair use rights, or other applicable
4644 copyright exceptions and limitations;
4645
4646 =item *
4647
4648 The author's moral rights;
4649
4650 =item *
4651
4652 Rights other persons may have either in the work itself or in
4653 how the work is used, such as publicity or privacy rights.
4654
4655 =back
4656
4657 =back
4658
4659 =over 9
4660
4661 =item B<Notice>
4662
4663 For any reuse or distribution, you must make clear to others the
4664 license terms of this work.
4665
4666 =back
4667
A copy of the full license is included in the file
LICENSES/CC-BY-SA-4.0.txt.
4670
4671
4672 =head1 DEPENDENCIES
4673
4674 GNU B<parallel> uses Perl, and the Perl modules Getopt::Long,
4675 IPC::Open3, Symbol, IO::File, POSIX, and File::Temp. For remote usage
4676 it also uses rsync with ssh.
4677
4678
4679 =head1 SEE ALSO
4680
4681 B<find>(1), B<xargs>(1), B<make>(1), B<pexec>(1), B<ppss>(1),
4682 B<xjobs>(1), B<prll>(1), B<dxargs>(1), B<mdm>(1)
4683
4684 =cut