Released as 20231022 ('Al-Aqsa Deluge')
1 #!/usr/bin/perl -w
3 # SPDX-FileCopyrightText: 2021-2023 Ole Tange, http://ole.tange.dk and Free Software Foundation, Inc.
4 # SPDX-License-Identifier: GFDL-1.3-or-later
5 # SPDX-License-Identifier: CC-BY-SA-4.0
7 =encoding utf8
9 =head1 NAME
11 parallel_alternatives - Alternatives to GNU B<parallel>
14 =head1 DIFFERENCES BETWEEN GNU Parallel AND ALTERNATIVES
16 There are a lot of programs that share functionality with GNU
17 B<parallel>. Some of these are specialized tools, and while GNU
18 B<parallel> can emulate many of them, a specialized tool can be better
19 at a given task. GNU B<parallel> strives to include the best of the
20 general functionality without sacrificing ease of use.
22 B<parallel> has existed since 2002-01-06 and as GNU B<parallel> since
23 2010. A lot of the alternatives have not had the vitality to survive
24 that long, but have come and gone during that time.
26 GNU B<parallel> is actively maintained with a new release every month
27 since 2010. Most other alternatives are fleeting interests of the
28 developers with irregular releases and only maintained for a few
29 years.
32 =head2 SUMMARY LEGEND
34 The following features are in some of the comparable tools:
36 =head3 Inputs
38 =over
40 =item I1. Arguments can be read from stdin
42 =item I2. Arguments can be read from a file
44 =item I3. Arguments can be read from multiple files
46 =item I4. Arguments can be read from command line
48 =item I5. Arguments can be read from a table
50 =item I6. Arguments can be read from the same file using #! (shebang)
52 =item I7. Line oriented input as default (Quoting of special chars not needed)
54 =back
57 =head3 Manipulation of input
59 =over
61 =item M1. Composed command
63 =item M2. Multiple arguments can fill up an execution line
65 =item M3. Arguments can be put anywhere in the execution line
67 =item M4. Multiple arguments can be put anywhere in the execution line
69 =item M5. Arguments can be replaced with context
71 =item M6. Input can be treated as the complete command line
73 =back
76 =head3 Outputs
78 =over
80 =item O1. Grouping output so output from different jobs does not mix
82 =item O2. Send stderr (standard error) to stderr (standard error)
84 =item O3. Send stdout (standard output) to stdout (standard output)
86 =item O4. Order of output can be same as order of input
88 =item O5. Stdout only contains stdout (standard output) from the command
90 =item O6. Stderr only contains stderr (standard error) from the command
92 =item O7. Buffering on disk
94 =item O8. No temporary files left if killed
96 =item O9. Test if disk runs full during run
98 =item O10. Output of a line bigger than 4 GB
100 =back
103 =head3 Execution
105 =over
107 =item E1. Run jobs in parallel
109 =item E2. List running jobs
111 =item E3. Finish running jobs, but do not start new jobs
113 =item E4. Number of running jobs can depend on number of cpus
115 =item E5. Finish running jobs, but do not start new jobs after first failure
117 =item E6. Number of running jobs can be adjusted while running
119 =item E7. Only spawn new jobs if load is less than a limit
121 =back
124 =head3 Remote execution
126 =over
128 =item R1. Jobs can be run on remote computers
130 =item R2. Basefiles can be transferred
132 =item R3. Argument files can be transferred
134 =item R4. Result files can be transferred
136 =item R5. Cleanup of transferred files
138 =item R6. No config files needed
140 =item R7. Do not run more than SSHD's MaxStartups can handle
142 =item R8. Configurable SSH command
144 =item R9. Retry if connection breaks occasionally
146 =back
149 =head3 Semaphore
151 =over
153 =item S1. Possibility to work as a mutex
155 =item S2. Possibility to work as a counting semaphore
157 =back
160 =head3 Legend
162 =over
164 =item - = no
166 =item x = not applicable
168 =item ID = yes
170 =back
172 As not every new version of the programs is tested, the table may be
173 outdated. Please file a bug report if you find errors (See REPORTING
174 BUGS).
176 parallel:
178 =over
180 =item I1 I2 I3 I4 I5 I6 I7
182 =item M1 M2 M3 M4 M5 M6
184 =item O1 O2 O3 O4 O5 O6 O7 O8 O9 O10
186 =item E1 E2 E3 E4 E5 E6 E7
188 =item R1 R2 R3 R4 R5 R6 R7 R8 R9
190 =item S1 S2
192 =back
195 =head2 DIFFERENCES BETWEEN xargs AND GNU Parallel
197 Summary (see legend above):
199 =over
201 =item I1 I2 - - - - -
203 =item - M2 M3 - - -
205 =item - O2 O3 - O5 O6
207 =item E1 - - - - - -
209 =item - - - - - x - - -
211 =item - -
213 =back
215 B<xargs> offers some of the same possibilities as GNU B<parallel>.
217 B<xargs> deals badly with special characters (such as space, \, ' and
218 "). To see the problem try this:
220 touch important_file
221 touch 'not important_file'
222 ls not* | xargs rm
223 mkdir -p "My brother's 12\" records"
224 ls | xargs rmdir
225 touch 'c:\windows\system32\clfs.sys'
226 echo 'c:\windows\system32\clfs.sys' | xargs ls -l
228 You can specify B<-0>, but many input generators are not optimized for
229 using B<NUL> as separator but are optimized for B<newline> as
230 separator. E.g. B<awk>, B<ls>, B<echo>, B<tar -v>, B<head> (requires
231 using B<-z>), B<tail> (requires using B<-z>), B<sed> (requires using
232 B<-z>), B<perl> (B<-0> and \0 instead of \n), B<locate> (requires
233 using B<-0>), B<find> (requires using B<-print0>), B<grep> (requires
234 using B<-z> or B<-Z>), B<sort> (requires using B<-z>).
236 GNU B<parallel>'s newline separation can be emulated with:
238 cat | xargs -d "\n" -n1 command
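As a rough illustration of the newline-separated mode above, the sketch below feeds two names through B<xargs> with B<-d>. The helper B<demo_names> and the sample names are made up for this example, and B<-d> is assumed to be available (it is a GNU extension).

```shell
# Sketch: split input on newlines only, so spaces and quotes survive.
# Assumes GNU xargs (for -d); demo_names is a hypothetical helper.
demo_names() {
  printf '%s\n' "My brother's 12\" records" 'plain'
}
# Each argument is printed in angle brackets on its own line.
demo_names | xargs -d '\n' -n1 printf '<%s>\n'
```

Without B<-d '\n'> the same input would most likely be rejected because of the unmatched quote characters.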
240 B<xargs> can run a given number of jobs in parallel, but has no
241 support for running number-of-cpu-cores jobs in parallel.
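One possible workaround is to look up the core count by hand and pass it to B<-P>. This is only a sketch and assumes that B<nproc> from GNU coreutils is installed; the variable name B<cores> is made up for the example.

```shell
# Assumption: nproc (GNU coreutils) reports the number of available cores.
cores=$(nproc)
# Run at most one job per core; each job just echoes its argument.
seq 8 | xargs -P "$cores" -n1 echo job
```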
243 B<xargs> has no support for grouping the output, therefore output may
244 run together, e.g. the first half of a line is from one process and
245 the last half of the line is from another process. The example
246 B<Parallel grep> cannot be done reliably with B<xargs> because of
247 this. To see this in action try:
249 parallel perl -e "'"'$a="1"."{}"x10000000;print $a,"\n"'"'" \
250 '>' {} ::: a b c d e f g h
251 # Serial = no mixing = the wanted result
252 # 'tr -s a-z' squeezes repeating letters into a single letter
253 echo a b c d e f g h | xargs -P1 -n1 grep 1 | tr -s a-z
254 # Compare to 8 jobs in parallel
255 parallel -kP8 -n1 grep 1 ::: a b c d e f g h | tr -s a-z
256 echo a b c d e f g h | xargs -P8 -n1 grep 1 | tr -s a-z
257 echo a b c d e f g h | xargs -P8 -n1 grep --line-buffered 1 | \
258 tr -s a-z
260 Or try this:
262 slow_seq() {
263 echo Count to "$@"
264 seq "$@" |
265 perl -ne '$|=1; for(split//){ print; select($a,$a,$a,0.100);}'
266 }
267 export -f slow_seq
268 # Serial = no mixing = the wanted result
269 seq 8 | xargs -n1 -P1 -I {} bash -c 'slow_seq {}'
270 # Compare to 8 jobs in parallel
271 seq 8 | parallel -P8 slow_seq {}
272 seq 8 | xargs -n1 -P8 -I {} bash -c 'slow_seq {}'
274 B<xargs> has no support for keeping the order of the output, therefore
275 if running jobs in parallel using B<xargs> the output of the second
276 job cannot be postponed till the first job is done.
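Ordered output can be approximated with B<xargs> by letting every job write to its own file and concatenating the files afterwards. The sketch below is a hypothetical workaround, not a feature of B<xargs>; it assumes B<-P> is supported and that B<sleep> accepts fractional seconds. The names B<tmp> and B<ordered> are made up.

```shell
# Sketch only: emulate ordered output with one file per job.
# Assumes xargs supports -P and sleep accepts fractions (GNU sleep).
tmp=$(mktemp -d)
# Job 1 sleeps longest, so it finishes last.
seq 3 | xargs -P3 -I{} sh -c 'sleep 0.$((3 - {})); echo "job {}" > "$0/{}.out"' "$tmp"
# Concatenate in input order, regardless of finishing order.
ordered=$(cat "$tmp/1.out" "$tmp/2.out" "$tmp/3.out")
rm -r "$tmp"
printf '%s\n' "$ordered"
```

Unlike B<parallel -k>, nothing is printed until all jobs have finished.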
278 B<xargs> has no support for running jobs on remote computers.
280 B<xargs> has no support for context replace, so you will have to create the
281 arguments.
283 If you use a replace string in B<xargs> (B<-I>) you cannot force
284 B<xargs> to use more than one argument.
286 Quoting in B<xargs> works like B<-q> in GNU B<parallel>. This means
287 composed commands and redirection require using B<bash -c>.
289 ls | parallel "wc {} >{}.wc"
290 ls | parallel "echo {}; ls {}|wc"
292 becomes (assuming you have 8 cores and that none of the filenames
293 contain space, " or '):
295 ls | xargs -d "\n" -P8 -I {} bash -c "wc {} >{}.wc"
296 ls | xargs -d "\n" -P8 -I {} bash -c "echo {}; ls {}|wc"
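The restriction on space, " and ' comes from pasting the filename into the script text. A hedged variant is to hand the name to B<bash -c> as a positional parameter instead. The sketch below assumes GNU B<xargs> and B<bash>; the directory B<dir> and the sample filename are made up for the demonstration.

```shell
# Sketch: pass each name as $1 instead of pasting it into the script text.
# Assumes GNU xargs (-d) and bash; the demo directory is hypothetical.
dir=$(mktemp -d)
printf 'hello\n' > "$dir/it's a file"
# "_" fills $0, so the filename arrives as "$1" and can be quoted safely.
(cd "$dir" && ls | xargs -d '\n' -P8 -I {} bash -c 'wc "$1" > "$1.wc"' _ {})
ls "$dir"
# Remove "$dir" when done.
```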
298 A more extreme example can be found on:
299 https://unix.stackexchange.com/q/405552/
301 https://www.gnu.org/software/findutils/
304 =head2 DIFFERENCES BETWEEN find -exec AND GNU Parallel
306 Summary (see legend above):
308 =over
310 =item - - - x - x -
312 =item - M2 M3 - - - -
314 =item - O2 O3 O4 O5 O6
316 =item - - - - - - -
318 =item - - - - - - - - -
320 =item x x
322 =back
324 B<find -exec> offers some of the same possibilities as GNU B<parallel>.
326 B<find -exec> only works on files. Processing other input (such as
327 hosts or URLs) will require creating these inputs as files. B<find
328 -exec> has no support for running commands in parallel.
330 https://www.gnu.org/software/findutils/
331 (Last checked: 2019-01)
334 =head2 DIFFERENCES BETWEEN make -j AND GNU Parallel
336 Summary (see legend above):
338 =over
340 =item - - - - - - -
342 =item - - - - - -
344 =item O1 O2 O3 - x O6
346 =item E1 - - - E5 -
348 =item - - - - - - - - -
350 =item - -
352 =back
354 B<make -j> can run jobs in parallel, but requires a crafted Makefile
355 to do this. That results in extra quoting to get filenames containing
356 newlines to work correctly.
358 B<make -j> computes a dependency graph before running jobs. Jobs run
359 by GNU B<parallel> do not depend on each other.
361 (Very early versions of GNU B<parallel> were coincidentally implemented
362 using B<make -j>).
364 https://www.gnu.org/software/make/
365 (Last checked: 2019-01)
368 =head2 DIFFERENCES BETWEEN ppss AND GNU Parallel
370 Summary (see legend above):
372 =over
374 =item I1 I2 - - - - I7
376 =item M1 - M3 - - M6
378 =item O1 - - x - -
380 =item E1 E2 ?E3 E4 - - -
382 =item R1 R2 R3 R4 - - ?R7 ? ?
384 =item - -
386 =back
388 B<ppss> is also a tool for running jobs in parallel.
390 The output of B<ppss> is status information and thus not useful for
391 using as input for another command. The output from the jobs is put
392 into files.
394 The argument replace string ($ITEM) cannot be changed. Arguments must
395 be quoted - thus arguments containing special characters (space '"&!*)
396 may cause problems. More than one argument is not supported. Filenames
397 containing newlines are not processed correctly. When reading input
398 from a file null cannot be used as a terminator. B<ppss> needs to read
399 the whole input file before starting any jobs.
401 Output and status information is stored in ppss_dir and thus requires
402 cleanup when completed. If the dir is not removed before running
403 B<ppss> again it may cause nothing to happen as B<ppss> thinks the
404 task is already done. GNU B<parallel> will normally not need cleaning
405 up if running locally and will only need cleaning up if stopped
406 abnormally and running remote (B<--cleanup> may not complete if
407 stopped abnormally). The example B<Parallel grep> would require extra
408 postprocessing if written using B<ppss>.
410 For remote systems B<ppss> requires 3 steps: config, deploy, and
411 start. GNU B<parallel> only requires one step.
413 =head3 EXAMPLES FROM ppss MANUAL
415 Here are the examples from B<ppss>'s manual page with the equivalent
416 using GNU B<parallel>:
418 1$ ./ppss.sh standalone -d /path/to/files -c 'gzip '
420 1$ find /path/to/files -type f | parallel gzip
422 2$ ./ppss.sh standalone -d /path/to/files \
423 -c 'cp "$ITEM" /destination/dir '
425 2$ find /path/to/files -type f | parallel cp {} /destination/dir
427 3$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q '
429 3$ parallel -a list-of-urls.txt wget -q
431 4$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q "$ITEM"'
433 4$ parallel -a list-of-urls.txt wget -q {}
435 5$ ./ppss config -C config.cfg -c 'encode.sh ' -d /source/dir \
436 -m 192.168.1.100 -u ppss -k ppss-key.key -S ./encode.sh \
437 -n nodes.txt -o /some/output/dir --upload --download;
438 ./ppss deploy -C config.cfg
439 ./ppss start -C config.cfg
441 5$ # parallel does not use configs. If you want
442 # a different username put it in nodes.txt: user@hostname
443 find source/dir -type f |
444 parallel --sshloginfile nodes.txt --trc {.}.mp3 \
445 lame -a {} -o {.}.mp3 --preset standard --quiet
447 6$ ./ppss stop -C config.cfg
449 6$ killall -TERM parallel
451 7$ ./ppss pause -C config.cfg
453 7$ Press: CTRL-Z or killall -SIGTSTP parallel
455 8$ ./ppss continue -C config.cfg
457 8$ Enter: fg or killall -SIGCONT parallel
459 9$ ./ppss.sh status -C config.cfg
461 9$ killall -SIGUSR2 parallel
463 https://github.com/louwrentius/PPSS
464 (Last checked: 2010-12)
467 =head2 DIFFERENCES BETWEEN pexec AND GNU Parallel
469 Summary (see legend above):
471 =over
473 =item I1 I2 - I4 I5 - -
475 =item M1 - M3 - - M6
477 =item O1 O2 O3 - O5 O6
479 =item E1 - - E4 - E6 -
481 =item R1 - - - - R6 - - -
483 =item S1 -
485 =back
487 B<pexec> is also a tool for running jobs in parallel.
489 =head3 EXAMPLES FROM pexec MANUAL
491 Here are the examples from B<pexec>'s info page with the equivalent
492 using GNU B<parallel>:
494 1$ pexec -o sqrt-%s.dat -p "$(seq 10)" -e NUM -n 4 -c -- \
495 'echo "scale=10000;sqrt($NUM)" | bc'
497 1$ seq 10 | parallel -j4 'echo "scale=10000;sqrt({})" | \
498 bc > sqrt-{}.dat'
500 2$ pexec -p "$(ls myfiles*.ext)" -i %s -o %s.sort -- sort
502 2$ ls myfiles*.ext | parallel sort {} ">{}.sort"
504 3$ pexec -f image.list -n auto -e B -u star.log -c -- \
505 'fistar $B.fits -f 100 -F id,x,y,flux -o $B.star'
507 3$ parallel -a image.list \
508 'fistar {}.fits -f 100 -F id,x,y,flux -o {}.star' 2>star.log
510 4$ pexec -r *.png -e IMG -c -o - -- \
511 'convert $IMG ${IMG%.png}.jpeg ; "echo $IMG: done"'
513 4$ ls *.png | parallel 'convert {} {.}.jpeg; echo {}: done'
515 5$ pexec -r *.png -i %s -o %s.jpg -c 'pngtopnm | pnmtojpeg'
517 5$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {}.jpg'
519 6$ for p in *.png ; do echo ${p%.png} ; done | \
520 pexec -f - -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'
522 6$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'
524 7$ LIST=$(for p in *.png ; do echo ${p%.png} ; done)
525 pexec -r $LIST -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'
527 7$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'
529 8$ pexec -n 8 -r *.jpg -y unix -e IMG -c \
530 'pexec -j -m blockread -d $IMG | \
531 jpegtopnm | pnmscale 0.5 | pnmtojpeg | \
532 pexec -j -m blockwrite -s th_$IMG'
534 8$ # Combining GNU parallel and GNU sem.
535 ls *jpg | parallel -j8 'sem --id blockread cat {} | jpegtopnm |' \
536 'pnmscale 0.5 | pnmtojpeg | sem --id blockwrite cat > th_{}'
538 # If reading and writing is done to the same disk, this may be
539 # faster as only one process will be either reading or writing:
540 ls *jpg | parallel -j8 'sem --id diskio cat {} | jpegtopnm |' \
541 'pnmscale 0.5 | pnmtojpeg | sem --id diskio cat > th_{}'
543 https://www.gnu.org/software/pexec/
544 (Last checked: 2010-12)
547 =head2 DIFFERENCES BETWEEN xjobs AND GNU Parallel
549 B<xjobs> is also a tool for running jobs in parallel. It only supports
550 running jobs on your local computer.
552 B<xjobs> deals badly with special characters just like B<xargs>. See
553 the section B<DIFFERENCES BETWEEN xargs AND GNU Parallel>.
555 =head3 EXAMPLES FROM xjobs MANUAL
557 Here are the examples from B<xjobs>'s man page with the equivalent
558 using GNU B<parallel>:
560 1$ ls -1 *.zip | xjobs unzip
562 1$ ls *.zip | parallel unzip
564 2$ ls -1 *.zip | xjobs -n unzip
566 2$ ls *.zip | parallel unzip >/dev/null
568 3$ find . -name '*.bak' | xjobs gzip
570 3$ find . -name '*.bak' | parallel gzip
572 4$ ls -1 *.jar | sed 's/\(.*\)/\1 > \1.idx/' | xjobs jar tf
574 4$ ls *.jar | parallel jar tf {} '>' {}.idx
576 5$ xjobs -s script
578 5$ cat script | parallel
580 6$ mkfifo /var/run/my_named_pipe;
581 xjobs -s /var/run/my_named_pipe &
582 echo unzip 1.zip >> /var/run/my_named_pipe;
583 echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe
585 6$ mkfifo /var/run/my_named_pipe;
586 cat /var/run/my_named_pipe | parallel &
587 echo unzip 1.zip >> /var/run/my_named_pipe;
588 echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe
590 https://www.maier-komor.de/xjobs.html
591 (Last checked: 2019-01)
594 =head2 DIFFERENCES BETWEEN prll AND GNU Parallel
596 B<prll> is also a tool for running jobs in parallel. It does not
597 support running jobs on remote computers.
599 B<prll> encourages using BASH aliases and BASH functions instead of
600 scripts. GNU B<parallel> supports scripts directly, functions if they
601 are exported using B<export -f>, and aliases if using B<env_parallel>.
603 B<prll> generates a lot of status information on stderr (standard
604 error) which makes it harder to use the stderr (standard error) output
605 of the job directly as input for another program.
607 =head3 EXAMPLES FROM prll's MANUAL
609 Here is the example from B<prll>'s man page with the equivalent
610 using GNU B<parallel>:
612 1$ prll -s 'mogrify -flip $1' *.jpg
614 1$ parallel mogrify -flip ::: *.jpg
616 https://github.com/exzombie/prll
617 (Last checked: 2019-01)
620 =head2 DIFFERENCES BETWEEN dxargs AND GNU Parallel
622 B<dxargs> is also a tool for running jobs in parallel.
624 B<dxargs> does not deal well with more simultaneous jobs than SSHD's
625 MaxStartups. B<dxargs> is only built for remote run jobs, but does not
626 support transferring of files.
628 https://web.archive.org/web/20120518070250/http://www.
629 semicomplete.com/blog/geekery/distributed-xargs.html
630 (Last checked: 2019-01)
633 =head2 DIFFERENCES BETWEEN mdm/middleman AND GNU Parallel
635 middleman(mdm) is also a tool for running jobs in parallel.
637 =head3 EXAMPLES FROM middleman's WEBSITE
639 Here are the shellscripts of
640 https://web.archive.org/web/20110728064735/http://mdm.
641 berlios.de/usage.html ported to GNU B<parallel>:
643 1$ seq 19 | parallel buffon -o - | sort -n > result
644 cat files | parallel cmd
645 find dir -execdir sem cmd {} \;
647 https://github.com/cklin/mdm
648 (Last checked: 2019-01)
651 =head2 DIFFERENCES BETWEEN xapply AND GNU Parallel
653 B<xapply> can run jobs in parallel on the local computer.
655 =head3 EXAMPLES FROM xapply's MANUAL
657 Here are the examples from B<xapply>'s man page with the equivalent
658 using GNU B<parallel>:
660 1$ xapply '(cd %1 && make all)' */
662 1$ parallel 'cd {} && make all' ::: */
664 2$ xapply -f 'diff %1 ../version5/%1' manifest | more
666 2$ parallel diff {} ../version5/{} < manifest | more
668 3$ xapply -p/dev/null -f 'diff %1 %2' manifest1 checklist1
670 3$ parallel --link diff {1} {2} :::: manifest1 checklist1
672 4$ xapply 'indent' *.c
674 4$ parallel indent ::: *.c
676 5$ find ~ksb/bin -type f ! -perm -111 -print | \
677 xapply -f -v 'chmod a+x' -
679 5$ find ~ksb/bin -type f ! -perm -111 -print | \
680 parallel -v chmod a+x
682 6$ find */ -... | fmt 960 1024 | xapply -f -i /dev/tty 'vi' -
684 6$ sh <(find */ -... | parallel -s 1024 echo vi)
686 6$ find */ -... | parallel -s 1024 -Xuj1 vi
688 7$ find ... | xapply -f -5 -i /dev/tty 'vi' - - - - -
690 7$ sh <(find ... | parallel -n5 echo vi)
692 7$ find ... | parallel -n5 -uj1 vi
694 8$ xapply -fn "" /etc/passwd
696 8$ parallel -k echo < /etc/passwd
698 9$ tr ':' '\012' < /etc/passwd | \
699 xapply -7 -nf 'chown %1 %6' - - - - - - -
701 9$ tr ':' '\012' < /etc/passwd | parallel -N7 chown {1} {6}
703 10$ xapply '[ -d %1/RCS ] || echo %1' */
705 10$ parallel '[ -d {}/RCS ] || echo {}' ::: */
707 11$ xapply -f '[ -f %1 ] && echo %1' List | ...
709 11$ parallel '[ -f {} ] && echo {}' < List | ...
711 https://www.databits.net/~ksb/msrc/local/bin/xapply/xapply.html (Last
712 checked: 2010-12)
715 =head2 DIFFERENCES BETWEEN AIX apply AND GNU Parallel
717 B<apply> can build command lines based on a template and arguments -
718 very much like GNU B<parallel>. B<apply> does not run jobs in
719 parallel. B<apply> does not use an argument separator (like B<:::>);
720 instead the template must be the first argument.
722 =head3 EXAMPLES FROM IBM's KNOWLEDGE CENTER
724 Here are the examples from IBM's Knowledge Center and the
725 corresponding command using GNU B<parallel>:
727 =head4 To obtain results similar to those of the B<ls> command, enter:
729 1$ apply echo *
730 1$ parallel echo ::: *
732 =head4 To compare the file named a1 to the file named b1, and
733 the file named a2 to the file named b2, enter:
735 2$ apply -2 cmp a1 b1 a2 b2
736 2$ parallel -N2 cmp ::: a1 b1 a2 b2
738 =head4 To run the B<who> command five times, enter:
740 3$ apply -0 who 1 2 3 4 5
741 3$ parallel -N0 who ::: 1 2 3 4 5
743 =head4 To link all files in the current directory to the directory
744 /usr/joe, enter:
746 4$ apply 'ln %1 /usr/joe' *
747 4$ parallel ln {} /usr/joe ::: *
749 https://www-01.ibm.com/support/knowledgecenter/
750 ssw_aix_71/com.ibm.aix.cmds1/apply.htm
751 (Last checked: 2019-01)
754 =head2 DIFFERENCES BETWEEN paexec AND GNU Parallel
756 B<paexec> can run jobs in parallel on both the local and remote computers.
758 B<paexec> requires commands to print a blank line as the last
759 output. This means you will have to write a wrapper for most programs.
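Such a wrapper could look roughly like the sketch below. It assumes that tasks arrive one per line on stdin, which is how the GNU B<parallel> equivalents in the examples feed the commands. The function name B<blank_line_wrap> is made up.

```shell
# Hypothetical wrapper: run a filter on each task line and
# terminate the result with a blank line.
blank_line_wrap() {
  while IFS= read -r task; do
    printf '%s\n' "$task" | "$@"
    echo
  done
}
# Example: upper-case two tasks, each result followed by a blank line.
printf 'a\nb\n' | blank_line_wrap tr a-z A-Z
```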
761 B<paexec> has a job dependency facility so a job can depend on another
762 job to be executed successfully. Sort of a poor-man's B<make>.
764 =head3 EXAMPLES FROM paexec's EXAMPLE CATALOG
766 Here are the examples from B<paexec>'s example catalog with the equivalent
767 using GNU B<parallel>:
769 =head4 1_div_X_run
771 1$ ../../paexec -s -l -c "`pwd`/1_div_X_cmd" -n +1 <<EOF [...]
773 1$ parallel echo {} '|' `pwd`/1_div_X_cmd <<EOF [...]
775 =head4 all_substr_run
777 2$ ../../paexec -lp -c "`pwd`/all_substr_cmd" -n +3 <<EOF [...]
779 2$ parallel echo {} '|' `pwd`/all_substr_cmd <<EOF [...]
781 =head4 cc_wrapper_run
783 3$ ../../paexec -c "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
784 -n 'host1 host2' \
785 -t '/usr/bin/ssh -x' <<EOF [...]
787 3$ parallel echo {} '|' "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
788 -S host1,host2 <<EOF [...]
790 # This is not exactly the same, but avoids the wrapper
791 parallel gcc -O2 -c -o {.}.o {} \
792 -S host1,host2 <<EOF [...]
794 =head4 toupper_run
796 4$ ../../paexec -lp -c "`pwd`/toupper_cmd" -n +10 <<EOF [...]
798 4$ parallel echo {} '|' ./toupper_cmd <<EOF [...]
800 # Without the wrapper:
801 parallel echo {} '| awk {print\ toupper\(\$0\)}' <<EOF [...]
803 https://github.com/cheusov/paexec
804 (Last checked: 2010-12)
807 =head2 DIFFERENCES BETWEEN map(sitaramc) AND GNU Parallel
809 Summary (see legend above):
811 =over
813 =item I1 - - I4 - - (I7)
815 =item M1 (M2) M3 (M4) M5 M6
817 =item - O2 O3 - O5 - - x x O10
819 =item E1 - - - - - -
821 =item - - - - - - - - -
823 =item - -
825 =back
827 (I7): Only under special circumstances. See below.
829 (M2+M4): Only if there is a single replacement string.
831 B<map> rejects input with special characters:
833 echo "The Cure" > My\ brother\'s\ 12\"\ records
835 ls | map 'echo %; wc %'
837 It works with GNU B<parallel>:
839 ls | parallel 'echo {}; wc {}'
841 Under some circumstances it also works with B<map>:
843 ls | map 'echo % works %'
845 But tiny changes make it reject the input with special characters:
847 ls | map 'echo % does not work "%"'
849 This means that many UTF-8 characters will be rejected. This is by
850 design. From the web page: "As such, programs that I<quietly handle
851 them, with no warnings at all,> are doing their users a disservice."
853 B<map> delays each job by 0.01 s. This can be emulated by using
854 B<parallel --delay 0.01>.
856 B<map> prints '+' on stderr when a job starts, and '-' when a job
857 finishes. This cannot be disabled. B<parallel> has B<--bar> if you
858 need to see progress.
860 B<map>'s replacement strings (% %D %B %E) can be simulated in GNU
861 B<parallel> by putting this in B<~/.parallel/config>:
863 --rpl '%'
864 --rpl '%D $_=Q(::dirname($_));'
865 --rpl '%B s:.*/::;s:\.[^/.]+$::;'
866 --rpl '%E s:.*\.::'
868 B<map> does not have an argument separator on the command line, but
869 uses the first argument as command. This makes quoting harder which again
870 may affect readability. Compare:
872 map -p 2 'perl -ne '"'"'/^\S+\s+\S+$/ and print $ARGV,"\n"'"'" *
874 parallel -q perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' ::: *
876 B<map> can do multiple arguments with context replace, but not without
877 context replace:
879 parallel --xargs echo 'BEGIN{'{}'}END' ::: 1 2 3
881 map "echo 'BEGIN{'%'}END'" 1 2 3
883 B<map> has no support for grouping. So this gives the wrong results:
885 parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
886 ::: a b c d e f
887 ls -l a b c d e f
888 parallel -kP4 -n1 grep 1 ::: a b c d e f > out.par
889 map -n1 -p 4 'grep 1' a b c d e f > out.map-unbuf
890 map -n1 -p 4 'grep --line-buffered 1' a b c d e f > out.map-linebuf
891 map -n1 -p 1 'grep --line-buffered 1' a b c d e f > out.map-serial
892 ls -l out*
893 md5sum out*
895 =head3 EXAMPLES FROM map's WEBSITE
897 Here are the examples from B<map>'s web page with the equivalent using
898 GNU B<parallel>:
900 1$ ls *.gif | map convert % %B.png # default max-args: 1
902 1$ ls *.gif | parallel convert {} {.}.png
904 2$ map "mkdir %B; tar -C %B -xf %" *.tgz # default max-args: 1
906 2$ parallel 'mkdir {.}; tar -C {.} -xf {}' ::: *.tgz
908 3$ ls *.gif | map cp % /tmp # default max-args: 100
910 3$ ls *.gif | parallel -X cp {} /tmp
912 4$ ls *.tar | map -n 1 tar -xf %
914 4$ ls *.tar | parallel tar -xf
916 5$ map "cp % /tmp" *.tgz
918 5$ parallel cp {} /tmp ::: *.tgz
920 6$ map "du -sm /home/%/mail" alice bob carol
922 6$ parallel "du -sm /home/{}/mail" ::: alice bob carol
923 or if you prefer running a single job with multiple args:
924 6$ parallel -Xj1 "du -sm /home/{}/mail" ::: alice bob carol
926 7$ cat /etc/passwd | map -d: 'echo user %1 has shell %7'
928 7$ cat /etc/passwd | parallel --colsep : 'echo user {1} has shell {7}'
930 8$ export MAP_MAX_PROCS=$(( `nproc` / 2 ))
932 8$ export PARALLEL=-j50%
934 https://github.com/sitaramc/map
935 (Last checked: 2020-05)
938 =head2 DIFFERENCES BETWEEN ladon AND GNU Parallel
940 B<ladon> can run multiple jobs on files in parallel.
942 B<ladon> only works on files and the only way to specify files is
943 using a quoted glob string (such as \*.jpg). It is not possible to
944 list the files manually.
946 As replacement strings it uses FULLPATH DIRNAME BASENAME EXT RELDIR
947 RELPATH
949 These can be simulated using GNU B<parallel> by putting this in
950 B<~/.parallel/config>:
952 --rpl 'FULLPATH $_=Q($_);chomp($_=qx{readlink -f $_});'
953 --rpl 'DIRNAME $_=Q(::dirname($_));chomp($_=qx{readlink -f $_});'
954 --rpl 'BASENAME s:.*/::;s:\.[^/.]+$::;'
955 --rpl 'EXT s:.*\.::'
956 --rpl 'RELDIR $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
957 s:\Q$c/\E::;$_=::dirname($_);'
958 --rpl 'RELPATH $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
959 s:\Q$c/\E::;'
961 B<ladon> deals badly with filenames containing " and newline, and it
962 fails for output larger than 200k:
964 ladon '*' -- seq 36000 | wc
966 =head3 EXAMPLES FROM ladon MANUAL
968 It is assumed that the '--rpl's above are put in B<~/.parallel/config>
969 and that it is run under a shell that supports '**' globbing (such as B<zsh>):
971 1$ ladon "**/*.txt" -- echo RELPATH
973 1$ parallel echo RELPATH ::: **/*.txt
975 2$ ladon "~/Documents/**/*.pdf" -- shasum FULLPATH >hashes.txt
977 2$ parallel shasum FULLPATH ::: ~/Documents/**/*.pdf >hashes.txt
979 3$ ladon -m thumbs/RELDIR "**/*.jpg" -- convert FULLPATH \
980 -thumbnail 100x100^ -gravity center -extent 100x100 \
981 thumbs/RELPATH
983 3$ parallel mkdir -p thumbs/RELDIR\; convert FULLPATH \
984 -thumbnail 100x100^ -gravity center -extent 100x100 \
985 thumbs/RELPATH ::: **/*.jpg
987 4$ ladon "~/Music/*.wav" -- lame -V 2 FULLPATH DIRNAME/BASENAME.mp3
989 4$ parallel lame -V 2 FULLPATH DIRNAME/BASENAME.mp3 ::: ~/Music/*.wav
991 https://github.com/danielgtaylor/ladon
992 (Last checked: 2019-01)
995 =head2 DIFFERENCES BETWEEN jobflow AND GNU Parallel
997 Summary (see legend above):
999 =over
1001 =item I1 - - - - - I7
1003 =item - - M3 - - (M6)
1005 =item O1 O2 O3 - O5 O6 (O7) - - O10
1007 =item E1 - - - - E6 -
1009 =item - - - - - - - - -
1011 =item - -
1013 =back
1016 B<jobflow> can run multiple jobs in parallel.
1018 Just like B<xargs>, output from B<jobflow> jobs running in parallel mixes
1019 together by default. B<jobflow> can buffer into files with
1020 B<-buffered> (placed in /run/shm), but these are not cleaned up if
1021 B<jobflow> dies unexpectedly (e.g. by Ctrl-C). If the total output is
1022 big (in the order of RAM+swap) it can cause the system to slow to a
1023 crawl and eventually run out of memory.
1025 Just like B<xargs>, redirection and composed commands require wrapping
1026 with B<bash -c>.
1028 Input lines can at most be 4096 bytes.
1030 B<jobflow> is faster than GNU B<parallel> but around 6 times slower
1031 than B<parallel-bash>.
1033 B<jobflow> has no equivalent for B<--pipe>, or B<--sshlogin>.
1035 B<jobflow> makes it possible to set resource limits on the running
1036 jobs. This can be emulated by GNU B<parallel> using B<bash>'s B<ulimit>:
1038 jobflow -limits=mem=100M,cpu=3,fsize=20M,nofiles=300 myjob
1040 parallel 'ulimit -v 102400 -t 3 -f 204800 -n 300; myjob'
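The same idea can be packaged as a small wrapper so the limits only apply to the job and not to the calling shell. This is a sketch under assumptions: B<limited_job> is a made-up name, and only two of the limits are shown.

```shell
# Hypothetical wrapper: the ( ... ) body runs in a subshell, so the
# limits do not leak into the calling shell.
limited_job() (
  ulimit -t 3     # at most 3 seconds of CPU time
  ulimit -n 300   # at most 300 open file descriptors
  "$@"
)
# The child process reports the lowered descriptor limit.
limited_job sh -c 'ulimit -n'
```

Under B<bash> such a function would have to be exported with B<export -f> before GNU B<parallel> can call it.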
1043 =head3 EXAMPLES FROM jobflow README
1045 1$ cat things.list | jobflow -threads=8 -exec ./mytask {}
1047 1$ cat things.list | parallel -j8 ./mytask {}
1049 2$ seq 100 | jobflow -threads=100 -exec echo {}
1051 2$ seq 100 | parallel -j100 echo {}
1053 3$ cat urls.txt | jobflow -threads=32 -exec wget {}
1055 3$ cat urls.txt | parallel -j32 wget {}
1057 4$ find . -name '*.bmp' | \
1058 jobflow -threads=8 -exec bmp2jpeg {.}.bmp {.}.jpg
1060 4$ find . -name '*.bmp' | \
1061 parallel -j8 bmp2jpeg {.}.bmp {.}.jpg
1063 5$ seq 100 | jobflow -skip 10 -count 10
1065 5$ seq 100 | parallel --filter '{1} > 10 and {1} <= 20' echo
1067 5$ seq 100 | parallel echo '{= $_>10 and $_<=20 or skip() =}'
1069 https://github.com/rofl0r/jobflow
1070 (Last checked: 2022-05)
1073 =head2 DIFFERENCES BETWEEN gargs AND GNU Parallel
1075 B<gargs> can run multiple jobs in parallel.
1077 Older versions cache output in memory. This causes it to be extremely
1078 slow when the output is larger than the physical RAM, and can cause
1079 the system to run out of memory.
1081 See more details on this in B<man parallel_design>.
1083 Newer versions cache output in files, but leave the files in $TMPDIR if
1084 B<gargs> is killed.
1086 Output to stderr (standard error) is changed if the command fails.
1088 =head3 EXAMPLES FROM gargs WEBSITE
1090 1$ seq 12 -1 1 | gargs -p 4 -n 3 "sleep {0}; echo {1} {2}"
1092 1$ seq 12 -1 1 | parallel -P 4 -n 3 "sleep {1}; echo {2} {3}"
1094 2$ cat t.txt | gargs --sep "\s+" \
1095 -p 2 "echo '{0}:{1}-{2}' full-line: \'{}\'"
1097 2$ cat t.txt | parallel --colsep "\\s+" \
1098 -P 2 "echo '{1}:{2}-{3}' full-line: \'{}\'"
1100 https://github.com/brentp/gargs
1101 (Last checked: 2016-08)
1104 =head2 DIFFERENCES BETWEEN orgalorg AND GNU Parallel
1106 B<orgalorg> can run the same job on multiple machines. This is related
1107 to B<--onall> and B<--nonall>.
1109 B<orgalorg> supports entering the SSH password - provided it is the
1110 same for all servers. GNU B<parallel> advocates using B<ssh-agent>
1111 instead, but it is possible to emulate B<orgalorg>'s behavior by
1112 setting SSHPASS and by using B<--ssh "sshpass ssh">.
1114 To make the emulation easier, make a simple alias:
1116 alias par_emul="parallel -j0 --ssh 'sshpass ssh' --nonall --tag --lb"
1118 If you want to supply a password run:
1120 SSHPASS=`ssh-askpass`
1122 or set the password directly:
1124 SSHPASS='P4$$w0rd!'
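When setting the password directly, note that the shell expands an unquoted B<$$> to its process ID, so a password containing B<$> is safest in single quotes. The variable names below are made up for the illustration.

```shell
# Unquoted: $$ is replaced by the PID of the shell.
unquoted=P4$$w0rd
# Single-quoted: the text is kept exactly as typed.
quoted='P4$$w0rd!'
printf '%s\n' "$unquoted" "$quoted"
```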
1126 If the above is set up you can then do:
1128 orgalorg -o frontend1 -o frontend2 -p -C uptime
1129 par_emul -S frontend1 -S frontend2 uptime
1131 orgalorg -o frontend1 -o frontend2 -p -C top -bid 1
1132 par_emul -S frontend1 -S frontend2 top -bid 1
1134 orgalorg -o frontend1 -o frontend2 -p -er /tmp -n \
1135 'md5sum /tmp/bigfile' -S bigfile
1136 par_emul -S frontend1 -S frontend2 --basefile bigfile \
1137 --workdir /tmp md5sum /tmp/bigfile
1139 B<orgalorg> has a progress indicator for the transferring of a
1140 file. GNU B<parallel> does not.
1142 https://github.com/reconquest/orgalorg
1143 (Last checked: 2016-08)
1146 =head2 DIFFERENCES BETWEEN Rust parallel(mmstick) AND GNU Parallel
1148 Rust parallel focuses on speed. It is almost as fast as B<xargs>, but
1149 not as fast as B<parallel-bash>. It implements a few features from GNU
1150 B<parallel>, but lacks many functions. All these fail:
1152 # Read arguments from file
1153 parallel -a file echo
1154 # Changing the delimiter
1155 parallel -d _ echo ::: a_b_c_
1157 These do something different from GNU B<parallel>:
1159 # -q to protect quoted $ and space
1160 parallel -q perl -e '$a=shift; print "$a"x10000000' ::: a b c
1161 # Generation of combination of inputs
1162 parallel echo {1} {2} ::: red green blue ::: S M L XL XXL
1163 # {= perl expression =} replacement string
1164 parallel echo '{= s/new/old/ =}' ::: my.new your.new
1165 # --pipe
1166 seq 100000 | parallel --pipe wc
1167 # linked arguments
1168 parallel echo ::: S M L :::+ sml med lrg ::: R G B :::+ red grn blu
1169 # Run different shell dialects
1170 zsh -c 'parallel echo \={} ::: zsh && true'
1171 csh -c 'parallel echo \$\{\} ::: shell && true'
1172 bash -c 'parallel echo \$\({}\) ::: pwd && true'
1173 # Rust parallel does not start before the last argument is read
1174 (seq 10; sleep 5; echo 2) | time parallel -j2 'sleep 2; echo'
1175 tail -f /var/log/syslog | parallel echo
1177 Most of the examples from the book GNU Parallel 2018 do not work, thus
1178 Rust parallel is not close to being a compatible replacement.
1180 Rust parallel has no remote facilities.
1182 It uses /tmp/parallel for tmp files and does not clean up if
1183 terminated abruptly. If another user on the system uses Rust parallel,
1184 then /tmp/parallel will have the wrong permissions and Rust parallel
1185 will fail. A malicious user can set up the right permissions and
1186 symlink the output file to one of the user's files, and the next time the
1187 user uses Rust parallel it will overwrite this file.
1189 attacker$ mkdir /tmp/parallel
1190 attacker$ chmod a+rwX /tmp/parallel
1191 # Symlink to the file the attacker wants to zero out
1192 attacker$ ln -s ~victim/.important-file /tmp/parallel/stderr_1
1193 victim$ seq 1000 | parallel echo
1194 # This file is now overwritten with stderr from 'echo'
1195 victim$ cat ~victim/.important-file
1197 If /tmp/parallel runs full during the run, Rust parallel does not
1198 report this, but finishes with success - thereby risking data loss.
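For comparison, the predictable-path problem described above is commonly avoided by letting B<mktemp> pick the directory name. The following is a minimal sketch of that general technique, not code taken from Rust parallel or GNU B<parallel>; the prefix B<demo> and the file name B<stderr_1> are only illustrations:

```shell
# mktemp -d creates a new directory with an unpredictable name that is
# only accessible to the current user, so another user can neither
# pre-create it nor plant symlinks in it.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/demo.XXXXXX")

# Remove the directory when the shell exits, and turn common
# termination signals into a normal exit so the EXIT trap still runs.
cleanup() { rm -rf -- "$workdir"; }
trap cleanup EXIT
trap 'exit 1' HUP INT TERM

# Per-job output goes inside the private directory.
echo "output from job 1" > "$workdir/stderr_1"
cat "$workdir/stderr_1"
```

A SIGKILL still leaves the directory behind, since no trap can run in that case.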
1200 https://github.com/mmstick/parallel
1201 (Last checked: 2016-08)
1204 =head2 DIFFERENCES BETWEEN Rush AND GNU Parallel
1206 B<rush> (https://github.com/shenwei356/rush) is written in Go and
1207 based on B<gargs>.
1209 Just like GNU B<parallel>, B<rush> buffers in temporary files. But
1210 unlike GNU B<parallel>, B<rush> does not clean up if the process
1211 dies abnormally.
1213 B<rush> has some string manipulations that can be emulated by putting
1214 this into ~/.parallel/config (/ is used instead of %, and % is used
1215 instead of ^ as that is closer to bash's ${var%postfix}):
1217 --rpl '{:} s:(\.[^/]+)*$::'
1218 --rpl '{:%([^}]+?)} s:$$1(\.[^/]+)*$::'
1219 --rpl '{/:%([^}]*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:'
1220 --rpl '{/:} s:(.*/)?([^/.]+)(\.[^/]+)*$:$2:'
1221 --rpl '{@(.*?)} /$$1/ and $_=$1;'
1223 =head3 EXAMPLES FROM rush's WEBSITE
1225 Here are the examples from B<rush>'s website with the equivalent
1226 command in GNU B<parallel>.
1228 B<1. Simple run, quoting is not necessary>
1230 1$ seq 1 3 | rush echo {}
1232 1$ seq 1 3 | parallel echo {}
1234 B<2. Read data from file (`-i`)>
1236 2$ rush echo {} -i data1.txt -i data2.txt
1238 2$ cat data1.txt data2.txt | parallel echo {}
1240 B<3. Keep output order (`-k`)>
1242 3$ seq 1 3 | rush 'echo {}' -k
1244 3$ seq 1 3 | parallel -k echo {}
1247 B<4. Timeout (`-t`)>
1249 4$ time seq 1 | rush 'sleep 2; echo {}' -t 1
1251 4$ time seq 1 | parallel --timeout 1 'sleep 2; echo {}'
1253 B<5. Retry (`-r`)>
1255 5$ seq 1 | rush 'python unexisted_script.py' -r 1
1257 5$ seq 1 | parallel --retries 2 'python unexisted_script.py'
1259 Use B<-u> to see it is really run twice:
1261 5$ seq 1 | parallel -u --retries 2 'python unexisted_script.py'
1263 B<6. Dirname (`{/}`) and basename (`{%}`) and remove custom
1264 suffix (`{^suffix}`)>
1266 6$ echo dir/file_1.txt.gz | rush 'echo {/} {%} {^_1.txt.gz}'
1268 6$ echo dir/file_1.txt.gz |
1269 parallel --plus echo {//} {/} {%_1.txt.gz}
1271 B<7. Get basename, and remove last (`{.}`) or any (`{:}`) extension>
1273 7$ echo dir.d/file.txt.gz | rush 'echo {.} {:} {%.} {%:}'
1275 7$ echo dir.d/file.txt.gz | parallel 'echo {.} {:} {/.} {/:}'
1277 B<8. Job ID, combine fields index and other replacement strings>
1279 8$ echo 12 file.txt dir/s_1.fq.gz |
1280 rush 'echo job {#}: {2} {2.} {3%:^_1}'
1282 8$ echo 12 file.txt dir/s_1.fq.gz |
1283 parallel --colsep ' ' 'echo job {#}: {2} {2.} {3/:%_1}'
1285 B<9. Capture submatch using regular expression (`{@regexp}`)>
1287 9$ echo read_1.fq.gz | rush 'echo {@(.+)_\d}'
1289 9$ echo read_1.fq.gz | parallel 'echo {@(.+)_\d}'
1291 B<10. Custom field delimiter (`-d`)>
1293 10$ echo a=b=c | rush 'echo {1} {2} {3}' -d =
1295 10$ echo a=b=c | parallel -d = echo {1} {2} {3}
1297 B<11. Send multi-lines to every command (`-n`)>
1299 11$ seq 5 | rush -n 2 -k 'echo "{}"; echo'
1301 11$ seq 5 |
1302 parallel -n 2 -k \
1303 'echo {=-1 $_=join"\n",@arg[1..$#arg] =}; echo'
1305 11$ seq 5 | rush -n 2 -k 'echo "{}"; echo' -J ' '
1307 11$ seq 5 | parallel -n 2 -k 'echo {}; echo'
1310 B<12. Custom record delimiter (`-D`), note that empty records are not used.>
1312 12$ echo a b c d | rush -D " " -k 'echo {}'
1314 12$ echo a b c d | parallel -d " " -k 'echo {}'
1316 12$ echo abcd | rush -D "" -k 'echo {}'
1318 Cannot be done by GNU Parallel
1320 12$ cat fasta.fa
1321 >seq1
1323 >seq2
1326 >seq3
1327 attac
1331 12$ cat fasta.fa | rush -D ">" \
1332 'echo FASTA record {#}: name: {1} sequence: {2}' -k -d "\n"
1333 # rush fails to join the multiline sequences
1335 12$ cat fasta.fa | (read -n1 ignore_first_char;
1336 parallel -d '>' --colsep '\n' echo FASTA record {#}: \
1337 name: {1} sequence: '{=2 $_=join"",@arg[2..$#arg]=}'
1338 )
1340 B<13. Assign value to variable, like `awk -v` (`-v`)>
1342 13$ seq 1 |
1343 rush 'echo Hello, {fname} {lname}!' -v fname=Wei -v lname=Shen
1345 13$ seq 1 |
1346 parallel -N0 \
1347 'fname=Wei; lname=Shen; echo Hello, ${fname} ${lname}!'
1349 13$ for var in a b; do \
1350 13$ seq 1 3 | rush -k -v var=$var 'echo var: {var}, data: {}'; \
1351 13$ done
1353 In GNU B<parallel> you would typically do:
1355 13$ seq 1 3 | parallel -k echo var: {1}, data: {2} ::: a b :::: -
1357 If you I<really> want the var:
1359 13$ seq 1 3 |
1360 parallel -k var={1} ';echo var: $var, data: {}' ::: a b :::: -
1362 If you I<really> want the B<for>-loop:
1364 13$ for var in a b; do
1365 export var;
1366 seq 1 3 | parallel -k 'echo var: $var, data: {}';
1367 done
1369 Contrary to B<rush> this also works if the value is complex like:
1371 My brother's 12" records
1374 B<14. Preset variable (`-v`), avoid repeatedly writing verbose replacement strings>
1376 14$ # naive way
1377 echo read_1.fq.gz | rush 'echo {:^_1} {:^_1}_2.fq.gz'
1379 14$ echo read_1.fq.gz | parallel 'echo {:%_1} {:%_1}_2.fq.gz'
1381 14$ # macro + removing suffix
1382 echo read_1.fq.gz |
1383 rush -v p='{:^_1}' 'echo {p} {p}_2.fq.gz'
1385 14$ echo read_1.fq.gz |
1386 parallel 'p={:%_1}; echo $p ${p}_2.fq.gz'
1388 14$ # macro + regular expression
1389 echo read_1.fq.gz | rush -v p='{@(.+?)_\d}' 'echo {p} {p}_2.fq.gz'
1391 14$ echo read_1.fq.gz | parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1393 Contrary to B<rush> GNU B<parallel> works with complex values:
1395 14$ echo "My brother's 12\"read_1.fq.gz" |
1396 parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1398 B<15. Interrupt jobs by `Ctrl-C`, rush will stop unfinished commands and exit.>
1400 15$ seq 1 20 | rush 'sleep 1; echo {}'
1403 15$ seq 1 20 | parallel 'sleep 1; echo {}'
1406 B<16. Continue/resume jobs (`-c`). When some jobs failed (by
1407 execution failure, timeout, or canceling by user with `Ctrl + C`),
1408 please switch flag `-c/--continue` on and run again, so that `rush`
1409 can save successful commands and ignore them in I<NEXT> run.>
1411 16$ seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1412 cat successful_cmds.rush
1413 seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1415 16$ seq 1 3 | parallel --joblog mylog --timeout 2 \
1416 'sleep {}; echo {}'
1417 cat mylog
1418 seq 1 3 | parallel --joblog mylog --retry-failed \
1419 'sleep {}; echo {}'
1421 Multi-line jobs:
1423 16$ seq 1 3 | rush 'sleep {}; echo {}; \
1424 echo finish {}' -t 3 -c -C finished.rush
1425 cat finished.rush
1426 seq 1 3 | rush 'sleep {}; echo {}; \
1427 echo finish {}' -t 3 -c -C finished.rush
1429 16$ seq 1 3 |
1430 parallel --joblog mylog --timeout 2 'sleep {}; echo {}; \
1431 echo finish {}'
1432 cat mylog
1433 seq 1 3 |
1434 parallel --joblog mylog --retry-failed 'sleep {}; echo {}; \
1435 echo finish {}'
1437 B<17. A comprehensive example: downloading 1K+ pages given by
1438 three URL list files using `phantomjs save_page.js` (some page
1439 contents are dynamically generated by Javascript, so `wget` does not
1440 work). Here I set max jobs number (`-j`) as `20`, each job has a max
1441 running time (`-t`) of `60` seconds and `3` retry chances
1442 (`-r`). Continue flag `-c` is also switched on, so we can continue
1443 unfinished jobs. Luckily, it's accomplished in one run :)>
1445 17$ for f in $(seq 2014 2016); do \
1446 /bin/rm -rf $f; mkdir -p $f; \
1447 cat $f.html.txt | rush -v d=$f -d = \
1448 'phantomjs save_page.js "{}" > {d}/{3}.html' \
1449 -j 20 -t 60 -r 3 -c; \
1450 done
1452 GNU B<parallel> can append to an existing joblog with '+':
1454 17$ rm mylog
1455 for f in $(seq 2014 2016); do
1456 /bin/rm -rf $f; mkdir -p $f;
1457 cat $f.html.txt |
1458 parallel -j20 --timeout 60 --retries 4 --joblog +mylog \
1459 --colsep = \
1460 phantomjs save_page.js {1}={2}={3} '>' $f/{3}.html
1461 done
1463 B<18. A bioinformatics example: mapping with `bwa`, and
1464 processing result with `samtools`:>
1466 18$ ref=ref/xxx.fa
1467 threads=25
1468 ls -d raw.cluster.clean.mapping/* \
1469 | rush -v ref=$ref -v j=$threads -v p='{}/{%}' \
1470 'bwa mem -t {j} -M -a {ref} {p}_1.fq.gz {p}_2.fq.gz >{p}.sam;\
1471 samtools view -bS {p}.sam > {p}.bam; \
1472 samtools sort -T {p}.tmp -@ {j} {p}.bam -o {p}.sorted.bam; \
1473 samtools index {p}.sorted.bam; \
1474 samtools flagstat {p}.sorted.bam > {p}.sorted.bam.flagstat; \
1475 /bin/rm {p}.bam {p}.sam;' \
1476 -j 2 --verbose -c -C mapping.rush
1478 GNU B<parallel> would use a function:
1480 18$ ref=ref/xxx.fa
1481 export ref
1482 thr=25
1483 export thr
1484 bwa_sam() {
1485 p="$1"
1486 bam="$p".bam
1487 sam="$p".sam
1488 sortbam="$p".sorted.bam
1489 bwa mem -t $thr -M -a $ref ${p}_1.fq.gz ${p}_2.fq.gz > "$sam"
1490 samtools view -bS "$sam" > "$bam"
1491 samtools sort -T ${p}.tmp -@ $thr "$bam" -o "$sortbam"
1492 samtools index "$sortbam"
1493 samtools flagstat "$sortbam" > "$sortbam".flagstat
1494 /bin/rm "$bam" "$sam"
1495 }
1496 export -f bwa_sam
1497 ls -d raw.cluster.clean.mapping/* |
1498 parallel -j 2 --verbose --joblog mylog bwa_sam
1500 =head3 Other B<rush> features
1502 B<rush> has:
1504 =over 4
1506 =item * B<awk -v> like custom defined variables (B<-v>)
1508 With GNU B<parallel> you would simply set a shell variable:
1510 parallel 'v={}; echo "$v"' ::: foo
1511 echo foo | rush -v v={} 'echo {v}'
1513 Also B<rush> does not like special chars. So these B<do not work>:
1515 echo does not work | rush -v v=\" 'echo {v}'
1516 echo "My brother's 12\" records" | rush -v v={} 'echo {v}'
1518 Whereas the corresponding GNU B<parallel> version works:
1520 parallel 'v=\"; echo "$v"' ::: works
1521 parallel 'v={}; echo "$v"' ::: "My brother's 12\" records"
1523 =item * Exit on first error(s) (-e)
1525 This is called B<--halt now,fail=1> (or shorter: B<--halt 2>) when
1526 used with GNU B<parallel>.
1528 =item * Settable records sending to every command (B<-n>, default 1)
1530 This is also called B<-n> in GNU B<parallel>.
1532 =item * Practical replacement strings
1534 =over 4
1536 =item {:} remove any extension
1538 With GNU B<parallel> this can be emulated by:
1540 parallel --plus echo '{/\..*/}' ::: foo.ext.bar.gz
1542 =item {^suffix}, remove suffix
1544 With GNU B<parallel> this can be emulated by:
1546 parallel --plus echo '{%.bar.gz}' ::: foo.ext.bar.gz
1548 =item {@regexp}, capture submatch using regular expression
1550 With GNU B<parallel> this can be emulated by:
1552 parallel --rpl '{@(.*?)} /$$1/ and $_=$1;' \
1553 echo '{@\d_(.*).gz}' ::: 1_foo.gz
1555 =item {%.}, {%:}, basename without extension
1557 With GNU B<parallel> this can be emulated by:
1559 parallel echo '{= s:.*/::;s/\..*// =}' ::: dir/foo.bar.gz
1561 And if you need it often, you define a B<--rpl> in
1562 B<$HOME/.parallel/config>:
1564 --rpl '{%.} s:.*/::;s/\..*//'
1565 --rpl '{%:} s:.*/::;s/\..*//'
1567 Then you can use them as:
1569 parallel echo {%.} {%:} ::: dir/foo.bar.gz
1571 =back
1573 =item * Preset variable (macro)
1575 E.g.
1577 echo foosuffix | rush -v p={^suffix} 'echo {p}_new_suffix'
1579 With GNU B<parallel> this can be emulated by:
1581 echo foosuffix |
1582 parallel --plus 'p={%suffix}; echo ${p}_new_suffix'
1584 Unlike B<rush>, GNU B<parallel> works fine if the input contains
1585 double space, ' and ":
1587 echo "1'6\" foosuffix" |
1588 parallel --plus 'p={%suffix}; echo "${p}"_new_suffix'
1591 =item * Commands of multi-lines
1593 While you I<can> use multi-lined commands in GNU B<parallel>, to
1594 improve readability GNU B<parallel> discourages the use of multi-line
1595 commands. In most cases it can be written as a function:
1597 seq 1 3 |
1598 parallel --timeout 2 --joblog my.log 'sleep {}; echo {}; \
1599 echo finish {}'
1601 Could be written as:
1603 doit() {
1604 sleep "$1"
1605 echo "$1"
1606 echo finish "$1"
1607 }
1608 export -f doit
1609 seq 1 3 | parallel --timeout 2 --joblog my.log doit
1611 The failed commands can be resumed with:
1613 seq 1 3 |
1614 parallel --resume-failed --joblog my.log 'sleep {}; echo {};\
1615 echo finish {}'
1617 =back
1619 https://github.com/shenwei356/rush
1620 (Last checked: 2017-05)
1623 =head2 DIFFERENCES BETWEEN ClusterSSH AND GNU Parallel
1625 ClusterSSH solves a different problem than GNU B<parallel>.
1627 ClusterSSH opens a terminal window for each computer and using a
1628 master window you can run the same command on all the computers. This
1629 is typically used for administrating several computers that are almost
1630 identical.
1632 GNU B<parallel> runs the same (or different) commands with different
1633 arguments in parallel possibly using remote computers to help
1634 computing. If more than one computer is listed in B<-S> GNU B<parallel> may
1635 only use one of these (e.g. if there are 8 jobs to be run and one
1636 computer has 8 cores).
1638 GNU B<parallel> can be used as a poor-man's version of ClusterSSH:
1640 B<parallel --nonall -S server-a,server-b do_stuff foo bar>
1642 https://github.com/duncs/clusterssh
1643 (Last checked: 2010-12)
1646 =head2 DIFFERENCES BETWEEN coshell AND GNU Parallel
1648 B<coshell> only accepts full commands on standard input. Any quoting
1649 needs to be done by the user.
1651 Commands are run in B<sh> so any B<bash>/B<tcsh>/B<zsh> specific
1652 syntax will not work.
1654 Output can be buffered by using B<-d>. Output is buffered in memory,
1655 so big output can cause swapping and therefore be terribly slow or
1656 even cause the system to run out of memory.
1658 https://github.com/gdm85/coshell
1659 (Last checked: 2019-01)
1662 =head2 DIFFERENCES BETWEEN spread AND GNU Parallel
1664 B<spread> runs commands on all directories.
1666 It can be emulated with GNU B<parallel> using this Bash function:
1668 spread() {
1669 _cmds() {
1670 perl -e '$"=" && ";print "@ARGV"' "cd {}" "$@"
1671 }
1672 parallel $(_cmds "$@")'|| echo exit status $?' ::: */
1673 }
1675 This works except for the B<--exclude> option.
1677 (Last checked: 2017-11)
1680 =head2 DIFFERENCES BETWEEN pyargs AND GNU Parallel
1682 B<pyargs> deals badly with input containing spaces. It buffers stdout,
1683 but not stderr. It buffers in RAM. {} does not work as replacement
1684 string. It does not support running functions.
1686 B<pyargs> does not support composed commands if run with B<--lines>,
1687 and fails on B<pyargs traceroute gnu.org fsf.org>.
1689 =head3 Examples
1691 seq 5 | pyargs -P50 -L seq
1692 seq 5 | parallel -P50 --lb seq
1694 seq 5 | pyargs -P50 --mark -L seq
1695 seq 5 | parallel -P50 --lb \
1696 --tagstring OUTPUT'[{= $_=$job->replaced() =}]' seq
1697 # Similar, but not precisely the same
1698 seq 5 | parallel -P50 --lb --tag seq
1700 seq 5 | pyargs -P50 --mark command
1701 # Somewhat longer with GNU Parallel due to the special
1702 # --mark formatting
1703 cmd="$(echo "command" | parallel --shellquote)"
1704 wrap_cmd() {
1705 echo "MARK $cmd $@================================" >&3
1706 echo "OUTPUT START[$cmd $@]:"
1707 eval $cmd "$@"
1708 echo "OUTPUT END[$cmd $@]"
1709 }
1710 (seq 5 | env_parallel -P2 wrap_cmd) 3>&1
1711 # Similar, but not exactly the same
1712 seq 5 | parallel -t --tag command
1714 (echo '1 2 3';echo 4 5 6) | pyargs --stream seq
1715 (echo '1 2 3';echo 4 5 6) | perl -pe 's/\n/ /' |
1716 parallel -r -d' ' seq
1717 # Similar, but not exactly the same
1718 parallel seq ::: 1 2 3 4 5 6
1720 https://github.com/robertblackwell/pyargs
1721 (Last checked: 2019-01)
1724 =head2 DIFFERENCES BETWEEN concurrently AND GNU Parallel
1726 B<concurrently> runs jobs in parallel.
1728 The output is prepended with the job number, and may be incomplete:
1730 $ concurrently 'seq 100000' | (sleep 3;wc -l)
1731 7165
1733 When pretty printing it caches output in memory. Test MIX below shows
1734 that output mixes whether or not it is cached.
1736 There seems to be no way of making a template command and have
1737 B<concurrently> fill that with different args. The full commands must
1738 be given on the command line.
1740 There is also no way of controlling how many jobs should be run in
1741 parallel at a time - i.e. "number of jobslots". Instead all jobs are
1742 simply started in parallel.
1744 https://github.com/kimmobrunfeldt/concurrently
1745 (Last checked: 2019-01)
1748 =head2 DIFFERENCES BETWEEN map(soveran) AND GNU Parallel
1750 B<map> does not run jobs in parallel by default. The README suggests using:
1752 ... | map t 'sleep $t && say done &'
1754 But this fails if more jobs are run in parallel than the number of
1755 available processes. Since there is no support for parallelization in
1756 B<map> itself, the output also mixes:
1758 seq 10 | map i 'echo start-$i && sleep 0.$i && echo end-$i &'
1760 The major difference is that GNU B<parallel> is built for parallelization
1761 and B<map> is not. So GNU B<parallel> has lots of ways of dealing with the
1762 issues that parallelization raises:
1764 =over 4
1766 =item *
1768 Keep the number of processes manageable
1770 =item *
1772 Make sure output does not mix
1774 =item *
1776 Make Ctrl-C kill all running processes
1778 =back
1780 =head3 EXAMPLES FROM map's WEBSITE
1782 Here are the 5 examples converted to GNU Parallel:
1784 1$ ls *.c | map f 'foo $f'
1785 1$ ls *.c | parallel foo
1787 2$ ls *.c | map f 'foo $f; bar $f'
1788 2$ ls *.c | parallel 'foo {}; bar {}'
1790 3$ cat urls | map u 'curl -O $u'
1791 3$ cat urls | parallel curl -O
1793 4$ printf "1\n1\n1\n" | map t 'sleep $t && say done'
1794 4$ printf "1\n1\n1\n" | parallel 'sleep {} && say done'
1795 4$ parallel 'sleep {} && say done' ::: 1 1 1
1797 5$ printf "1\n1\n1\n" | map t 'sleep $t && say done &'
1798 5$ printf "1\n1\n1\n" | parallel -j0 'sleep {} && say done'
1799 5$ parallel -j0 'sleep {} && say done' ::: 1 1 1
1801 https://github.com/soveran/map
1802 (Last checked: 2019-01)
1805 =head2 DIFFERENCES BETWEEN loop AND GNU Parallel
1807 B<loop> mixes stdout and stderr:
1809 loop 'ls /no-such-file' >/dev/null
1811 B<loop>'s replacement string B<$ITEM> does not quote strings:
1813 echo 'two spaces' | loop 'echo $ITEM'
1815 B<loop> cannot run functions:
1817 myfunc() { echo joe; }
1818 export -f myfunc
1819 loop 'myfunc this fails'
1821 =head3 EXAMPLES FROM loop's WEBSITE
1823 Some of the examples from https://github.com/Miserlou/Loop/ can be
1824 emulated with GNU B<parallel>:
1826 # A couple of functions will make the code easier to read
1827 $ loopy() {
1828 yes | parallel -uN0 -j1 "$@"
1829 }
1830 $ export -f loopy
1831 $ time_out() {
1832 parallel -uN0 -q --timeout "$@" ::: 1
1833 }
1834 $ match() {
1835 perl -0777 -ne 'grep /'"$1"'/,$_ and print or exit 1'
1836 }
1837 $ export -f match
1839 $ loop 'ls' --every 10s
1840 $ loopy --delay 10s ls
1842 $ loop 'touch $COUNT.txt' --count-by 5
1843 $ loopy touch '{= $_=seq()*5 =}'.txt
1845 $ loop --until-contains 200 -- \
1846 ./get_response_code.sh --site mysite.biz
1847 $ loopy --halt now,success=1 \
1848 './get_response_code.sh --site mysite.biz | match 200'
1850 $ loop './poke_server' --for-duration 8h
1851 $ time_out 8h loopy ./poke_server
1853 $ loop './poke_server' --until-success
1854 $ loopy --halt now,success=1 ./poke_server
1856 $ cat files_to_create.txt | loop 'touch $ITEM'
1857 $ cat files_to_create.txt | parallel touch {}
1859 $ loop 'ls' --for-duration 10min --summary
1860 # --joblog is somewhat more verbose than --summary
1861 $ time_out 10m loopy --joblog my.log ls; cat my.log
1863 $ loop 'echo hello'
1864 $ loopy echo hello
1866 $ loop 'echo $COUNT'
1867 # GNU Parallel counts from 1
1868 $ loopy echo {#}
1869 # Counting from 0 can be forced
1870 $ loopy echo '{= $_=seq()-1 =}'
1872 $ loop 'echo $COUNT' --count-by 2
1873 $ loopy echo '{= $_=2*(seq()-1) =}'
1875 $ loop 'echo $COUNT' --count-by 2 --offset 10
1876 $ loopy echo '{= $_=10+2*(seq()-1) =}'
1878 $ loop 'echo $COUNT' --count-by 1.1
1879 # GNU Parallel rounds 3.3000000000000003 to 3.3
1880 $ loopy echo '{= $_=1.1*(seq()-1) =}'
1882 $ loop 'echo $COUNT $ACTUALCOUNT' --count-by 2
1883 $ loopy echo '{= $_=2*(seq()-1) =} {#}'
1885 $ loop 'echo $COUNT' --num 3 --summary
1886 # --joblog is somewhat more verbose than --summary
1887 $ seq 3 | parallel --joblog my.log echo; cat my.log
1889 $ loop 'ls -foobarbatz' --num 3 --summary
1890 # --joblog is somewhat more verbose than --summary
1891 $ seq 3 | parallel --joblog my.log -N0 ls -foobarbatz; cat my.log
1893 $ loop 'echo $COUNT' --count-by 2 --num 50 --only-last
1894 # Can be emulated by running 2 jobs
1895 $ seq 49 | parallel echo '{= $_=2*(seq()-1) =}' >/dev/null
1896 $ echo 50| parallel echo '{= $_=2*(seq()-1) =}'
1898 $ loop 'date' --every 5s
1899 $ loopy --delay 5s date
1901 $ loop 'date' --for-duration 8s --every 2s
1902 $ time_out 8s loopy --delay 2s date
1904 $ loop 'date -u' --until-time '2018-05-25 20:50:00' --every 5s
1905 $ seconds=$((`date -d 2019-05-25T20:50:00 +%s` - `date +%s`))s
1906 $ time_out $seconds loopy --delay 5s date -u
1908 $ loop 'echo $RANDOM' --until-contains "666"
1909 $ loopy --halt now,success=1 'echo $RANDOM | match 666'
1911 $ loop 'if (( RANDOM % 2 )); then
1912 (echo "TRUE"; true);
1913 else
1914 (echo "FALSE"; false);
1915 fi' --until-success
1916 $ loopy --halt now,success=1 'if (( $RANDOM % 2 )); then
1917 (echo "TRUE"; true);
1918 else
1919 (echo "FALSE"; false);
1920 fi'
1922 $ loop 'if (( RANDOM % 2 )); then
1923 (echo "TRUE"; true);
1924 else
1925 (echo "FALSE"; false);
1926 fi' --until-error
1927 $ loopy --halt now,fail=1 'if (( $RANDOM % 2 )); then
1928 (echo "TRUE"; true);
1929 else
1930 (echo "FALSE"; false);
1931 fi'
1933 $ loop 'date' --until-match "(\d{4})"
1934 $ loopy --halt now,success=1 'date | match [0-9][0-9][0-9][0-9]'
1936 $ loop 'echo $ITEM' --for red,green,blue
1937 $ parallel echo ::: red green blue
1939 $ cat /tmp/my-list-of-files-to-create.txt | loop 'touch $ITEM'
1940 $ cat /tmp/my-list-of-files-to-create.txt | parallel touch
1942 $ ls | loop 'cp $ITEM $ITEM.bak'; ls
1943 $ ls | parallel cp {} {}.bak; ls
1945 $ loop 'echo $ITEM | tr a-z A-Z' -i
1946 $ parallel 'echo {} | tr a-z A-Z'
1947 # Or more efficiently:
1948 $ parallel --pipe tr a-z A-Z
1950 $ loop 'echo $ITEM' --for "`ls`"
1951 $ parallel echo {} ::: "`ls`"
1953 $ ls | loop './my_program $ITEM' --until-success;
1954 $ ls | parallel --halt now,success=1 ./my_program {}
1956 $ ls | loop './my_program $ITEM' --until-fail;
1957 $ ls | parallel --halt now,fail=1 ./my_program {}
1959 $ ./deploy.sh;
1960 loop 'curl -sw "%{http_code}" http://coolwebsite.biz' \
1961 --every 5s --until-contains 200;
1962 ./announce_to_slack.sh
1963 $ ./deploy.sh;
1964 loopy --delay 5s --halt now,success=1 \
1965 'curl -sw "%{http_code}" http://coolwebsite.biz | match 200';
1966 ./announce_to_slack.sh
1968 $ loop "ping -c 1 mysite.com" --until-success; ./do_next_thing
1969 $ loopy --halt now,success=1 ping -c 1 mysite.com; ./do_next_thing
1971 $ ./create_big_file -o my_big_file.bin;
1972 loop 'ls' --until-contains 'my_big_file.bin';
1973 ./upload_big_file my_big_file.bin
1974 # inotifywait is a better tool to detect file system changes.
1975 # It can even make sure the file is complete
1976 # so you are not uploading an incomplete file
1977 $ inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f . |
1978 grep my_big_file.bin
1980 $ ls | loop 'cp $ITEM $ITEM.bak'
1981 $ ls | parallel cp {} {}.bak
1983 $ loop './do_thing.sh' --every 15s --until-success --num 5
1984 $ parallel --retries 5 --delay 15s ::: ./do_thing.sh
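Several of the examples above depend on the exit status of the B<match> helper, which B<--halt now,success=1> uses to decide when to stop. A short sketch of how the helper behaves when used by itself; the input strings are invented for the illustration:

```shell
# Same helper as defined for the examples: read all of stdin, print it
# and exit 0 if it matches the regexp in $1, otherwise exit 1.
match() {
  perl -0777 -ne 'grep /'"$1"'/,$_ and print or exit 1'
}

echo 'HTTP 200 OK' | match 200 && echo "matched"
echo 'HTTP 404 Not Found' | match 200 || echo "no match, exit status $?"
```

The first pipeline prints the input followed by "matched"; the second prints only "no match, exit status 1".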
1986 https://github.com/Miserlou/Loop/
1987 (Last checked: 2018-10)
1990 =head2 DIFFERENCES BETWEEN lorikeet AND GNU Parallel
1992 B<lorikeet> can run jobs in parallel. It does this based on a
1993 dependency graph described in a file, so this is similar to B<make>.
1995 https://github.com/cetra3/lorikeet
1996 (Last checked: 2018-10)
1999 =head2 DIFFERENCES BETWEEN spp AND GNU Parallel
2001 B<spp> can run jobs in parallel. B<spp> does not use a command
2002 template to generate the jobs, but requires jobs to be in a
2003 file. Output from the jobs mixes.
2005 https://github.com/john01dav/spp
2006 (Last checked: 2019-01)
2009 =head2 DIFFERENCES BETWEEN paral AND GNU Parallel
2011 B<paral> prints a lot of status information and stores the output from
2012 the commands run into files. This means it cannot be used in the middle
2013 of a pipe like this:
2015 paral "echo this" "echo does not" "echo work" | wc
2017 Instead it puts the output into files named like
2018 B<out_#_I<command>.out.log>. To get a very similar behaviour with GNU
2019 B<parallel> use B<--results
2020 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta>
2022 B<paral> only takes arguments on the command line and each argument
2023 should be a full command. Thus it does not use command templates.
2025 This limits how many jobs it can run in total, because they all need
2026 to fit on a single command line.
2028 B<paral> has no support for running jobs remotely.
2030 =head3 EXAMPLES FROM README.markdown
2032 The examples from B<README.markdown> and the corresponding command run
2033 with GNU B<parallel> (B<--results
2034 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta> is omitted from
2035 the GNU B<parallel> command):
2037 1$ paral "command 1" "command 2 --flag" "command arg1 arg2"
2038 1$ parallel ::: "command 1" "command 2 --flag" "command arg1 arg2"
2040 2$ paral "sleep 1 && echo c1" "sleep 2 && echo c2" \
2041 "sleep 3 && echo c3" "sleep 4 && echo c4" "sleep 5 && echo c5"
2042 2$ parallel ::: "sleep 1 && echo c1" "sleep 2 && echo c2" \
2043 "sleep 3 && echo c3" "sleep 4 && echo c4" "sleep 5 && echo c5"
2044 # Or shorter:
2045 parallel "sleep {} && echo c{}" ::: {1..5}
2047 3$ paral -n=0 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2048 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2049 3$ parallel ::: "sleep 5 && echo c5" "sleep 4 && echo c4" \
2050 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2051 # Or shorter:
2052 parallel -j0 "sleep {} && echo c{}" ::: 5 4 3 2 1
2054 4$ paral -n=1 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2055 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2056 4$ parallel -j1 "sleep {} && echo c{}" ::: 5 4 3 2 1
2058 5$ paral -n=2 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2059 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2060 5$ parallel -j2 "sleep {} && echo c{}" ::: 5 4 3 2 1
2062 6$ paral -n=5 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2063 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2064 6$ parallel -j5 "sleep {} && echo c{}" ::: 5 4 3 2 1
2066 7$ paral -n=1 "echo a && sleep 0.5 && echo b && sleep 0.5 && \
2067 echo c && sleep 0.5 && echo d && sleep 0.5 && \
2068 echo e && sleep 0.5 && echo f && sleep 0.5 && \
2069 echo g && sleep 0.5 && echo h"
2070 7$ parallel ::: "echo a && sleep 0.5 && echo b && sleep 0.5 && \
2071 echo c && sleep 0.5 && echo d && sleep 0.5 && \
2072 echo e && sleep 0.5 && echo f && sleep 0.5 && \
2073 echo g && sleep 0.5 && echo h"
2075 https://github.com/amattn/paral
2076 (Last checked: 2019-01)
2079 =head2 DIFFERENCES BETWEEN concurr AND GNU Parallel
2081 B<concurr> is built to run jobs in parallel using a client/server
2082 model.
2084 =head3 EXAMPLES FROM README.md
2086 The examples from B<README.md>:
2088 1$ concurr 'echo job {#} on slot {%}: {}' : arg1 arg2 arg3 arg4
2089 1$ parallel 'echo job {#} on slot {%}: {}' ::: arg1 arg2 arg3 arg4
2091 2$ concurr 'echo job {#} on slot {%}: {}' :: file1 file2 file3
2092 2$ parallel 'echo job {#} on slot {%}: {}' :::: file1 file2 file3
2094 3$ concurr 'echo {}' < input_file
2095 3$ parallel 'echo {}' < input_file
2097 4$ cat file | concurr 'echo {}'
2098 4$ cat file | parallel 'echo {}'
2100 B<concurr> deals badly with empty input files and with output larger
2101 than 64 KB.
2103 https://github.com/mmstick/concurr
2104 (Last checked: 2019-01)
2107 =head2 DIFFERENCES BETWEEN lesser-parallel AND GNU Parallel
2109 B<lesser-parallel> is the inspiration for B<parallel --embed>. Both
2110 B<lesser-parallel> and B<parallel --embed> define bash functions that
2111 can be included as part of a bash script to run jobs in parallel.
2113 B<lesser-parallel> implements a few of the replacement strings, but
2114 hardly any options, whereas B<parallel --embed> gives you the full
2115 GNU B<parallel> experience.
2117 https://github.com/kou1okada/lesser-parallel
2118 (Last checked: 2019-01)
2121 =head2 DIFFERENCES BETWEEN npm-parallel AND GNU Parallel
2123 B<npm-parallel> can run npm tasks in parallel.
2125 There are no examples and very little documentation, so it is hard to
2126 compare to GNU B<parallel>.
2128 https://github.com/spion/npm-parallel
2129 (Last checked: 2019-01)
2132 =head2 DIFFERENCES BETWEEN machma AND GNU Parallel
2134 B<machma> runs tasks in parallel. It gives time stamped
2135 output. It buffers in RAM.
2137 =head3 EXAMPLES FROM README.md
2139 The examples from README.md:
2141 1$ # Put shorthand for timestamp in config for the examples
2142 echo '--rpl '\
2143 \''{time} $_=::strftime("%Y-%m-%d %H:%M:%S",localtime())'\' \
2144 > ~/.parallel/machma
2145 echo '--line-buffer --tagstring "{#} {time} {}"' \
2146 >> ~/.parallel/machma
2148 2$ find . -iname '*.jpg' |
2149 machma -- mogrify -resize 1200x1200 -filter Lanczos {}
2150 find . -iname '*.jpg' |
2151 parallel --bar -Jmachma mogrify -resize 1200x1200 \
2152 -filter Lanczos {}
2154 3$ cat /tmp/ips | machma -p 2 -- ping -c 2 -q {}
2155 3$ cat /tmp/ips | parallel -j2 -Jmachma ping -c 2 -q {}
2157 4$ cat /tmp/ips |
2158 machma -- sh -c 'ping -c 2 -q $0 > /dev/null && echo alive' {}
2159 4$ cat /tmp/ips |
2160 parallel -Jmachma 'ping -c 2 -q {} > /dev/null && echo alive'
2162 5$ find . -iname '*.jpg' |
2163 machma --timeout 5s -- mogrify -resize 1200x1200 \
2164 -filter Lanczos {}
2165 5$ find . -iname '*.jpg' |
2166 parallel --timeout 5s --bar mogrify -resize 1200x1200 \
2167 -filter Lanczos {}
2169 6$ find . -iname '*.jpg' -print0 |
2170 machma --null -- mogrify -resize 1200x1200 -filter Lanczos {}
2171 6$ find . -iname '*.jpg' -print0 |
2172 parallel --null --bar mogrify -resize 1200x1200 \
2173 -filter Lanczos {}
2175 https://github.com/fd0/machma
2176 (Last checked: 2019-06)
2179 =head2 DIFFERENCES BETWEEN interlace AND GNU Parallel
2181 Summary (see legend above):
2183 =over
2185 =item - I2 I3 I4 - - -
2187 =item M1 - M3 - - M6
2189 =item - O2 O3 - - - - x x
2191 =item E1 E2 - - - - -
2193 =item - - - - - - - - -
2195 =item - -
2197 =back
2199 B<interlace> is built for network analysis to run network tools in parallel.
B<interlace> does not buffer output, so output from different jobs mixes.
2203 The overhead for each target is O(n*n), so with 1000 targets it
2204 becomes very slow with an overhead in the order of 500ms/target.
2206 =head3 EXAMPLES FROM interlace's WEBSITE
2208 Using B<prips> most of the examples from
2209 https://github.com/codingo/Interlace can be run with GNU B<parallel>:
2211 Blocker
2213 commands.txt:
2214 mkdir -p _output_/_target_/scans/
2215 _blocker_
2216 nmap _target_ -oA _output_/_target_/scans/_target_-nmap
2217 interlace -tL ./targets.txt -cL commands.txt -o $output
2219 parallel -a targets.txt \
2220 mkdir -p $output/{}/scans/\; nmap {} -oA $output/{}/scans/{}-nmap
2222 Blocks
2224 commands.txt:
2225 _block:nmap_
2226 mkdir -p _target_/output/scans/
2227 nmap _target_ -oN _target_/output/scans/_target_-nmap
2228 _block:nmap_
2229 nikto --host _target_
2230 interlace -tL ./targets.txt -cL commands.txt
  _nmap() {
    mkdir -p $1/output/scans/
    nmap $1 -oN $1/output/scans/$1-nmap
  }
  export -f _nmap
2237 parallel ::: _nmap "nikto --host" :::: targets.txt
2239 Run Nikto Over Multiple Sites
2241 interlace -tL ./targets.txt -threads 5 \
2242 -c "nikto --host _target_ > ./_target_-nikto.txt" -v
  parallel -a targets.txt -P5 nikto --host {} \> ./{}-nikto.txt
2246 Run Nikto Over Multiple Sites and Ports
2248 interlace -tL ./targets.txt -threads 5 -c \
2249 "nikto --host _target_:_port_ > ./_target_-_port_-nikto.txt" \
2250 -p 80,443 -v
2252 parallel -P5 nikto --host {1}:{2} \> ./{1}-{2}-nikto.txt \
2253 :::: targets.txt ::: 80 443
2255 Run a List of Commands against Target Hosts
2257 commands.txt:
2258 nikto --host _target_:_port_ > _output_/_target_-nikto.txt
2259 sslscan _target_:_port_ > _output_/_target_-sslscan.txt
2260 testssl.sh _target_:_port_ > _output_/_target_-testssl.txt
2261 interlace -t example.com -o ~/Engagements/example/ \
2262 -cL ./commands.txt -p 80,443
2264 parallel --results ~/Engagements/example/{2}:{3}{1} {1} {2}:{3} \
2265 ::: "nikto --host" sslscan testssl.sh ::: example.com ::: 80 443
2267 CIDR notation with an application that doesn't support it
2269 interlace -t 192.168.12.0/24 -c "vhostscan _target_ \
2270 -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
2272 prips 192.168.12.0/24 |
2273 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2275 Glob notation with an application that doesn't support it
2277 interlace -t 192.168.12.* -c "vhostscan _target_ \
2278 -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
2280 # Glob is not supported in prips
2281 prips 192.168.12.0/24 |
2282 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2284 Dash (-) notation with an application that doesn't support it
2286 interlace -t 192.168.12.1-15 -c \
2287 "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
2288 -o ~/scans/ -threads 50
2290 # Dash notation is not supported in prips
2291 prips 192.168.12.1 192.168.12.15 |
2292 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2294 Threading Support for an application that doesn't support it
2296 interlace -tL ./target-list.txt -c \
2297 "vhostscan -t _target_ -oN _output_/_target_-vhosts.txt" \
2298 -o ~/scans/ -threads 50
2300 cat ./target-list.txt |
2301 parallel -P50 vhostscan -t {} -oN ~/scans/{}-vhosts.txt
2303 alternatively
2305 ./vhosts-commands.txt:
2306 vhostscan -t $target -oN _output_/_target_-vhosts.txt
2307 interlace -cL ./vhosts-commands.txt -tL ./target-list.txt \
2308 -threads 50 -o ~/scans
2310 ./vhosts-commands.txt:
2311 vhostscan -t "$1" -oN "$2"
2312 parallel -P50 ./vhosts-commands.txt {} ~/scans/{}-vhosts.txt \
2313 :::: ./target-list.txt
2315 Exclusions
2317 interlace -t 192.168.12.0/24 -e 192.168.12.0/26 -c \
2318 "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
2319 -o ~/scans/ -threads 50
2321 prips 192.168.12.0/24 | grep -xv -Ff <(prips 192.168.12.0/26) |
2322 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2324 Run Nikto Using Multiple Proxies
2326 interlace -tL ./targets.txt -pL ./proxies.txt -threads 5 -c \
2327 "nikto --host _target_:_port_ -useproxy _proxy_ > \
2328 ./_target_-_port_-nikto.txt" -p 80,443 -v
2330 parallel -j5 \
2331 "nikto --host {1}:{2} -useproxy {3} > ./{1}-{2}-nikto.txt" \
2332 :::: ./targets.txt ::: 80 443 :::: ./proxies.txt
2334 https://github.com/codingo/Interlace
2335 (Last checked: 2019-09)
2338 =head2 DIFFERENCES BETWEEN otonvm Parallel AND GNU Parallel
2340 I have been unable to get the code to run at all. It seems unfinished.
2342 https://github.com/otonvm/Parallel
2343 (Last checked: 2019-02)
2346 =head2 DIFFERENCES BETWEEN k-bx par AND GNU Parallel
2348 B<par> requires Haskell to work. This limits the number of platforms
2349 this can work on.
2351 B<par> does line buffering in memory. The memory usage is 3x the
2352 longest line (compared to 1x for B<parallel --lb>). Commands must be
2353 given as arguments. There is no template.
2355 These are the examples from https://github.com/k-bx/par with the
2356 corresponding GNU B<parallel> command.
2358 par "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2359 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2360 parallel --lb ::: "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2361 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2363 par "echo foo; sleep 1; foofoo" \
2364 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2365 parallel --lb --halt 1 ::: "echo foo; sleep 1; foofoo" \
2366 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2368 par "PARPREFIX=[fooechoer] echo foo" "PARPREFIX=[bar] echo bar"
2369 parallel --lb --colsep , --tagstring {1} {2} \
2370 ::: "[fooechoer],echo foo" "[bar],echo bar"
2372 par --succeed "foo" "bar" && echo 'wow'
  parallel ::: "foo" "bar"; true && echo 'wow'
2375 https://github.com/k-bx/par
2376 (Last checked: 2019-02)
2378 =head2 DIFFERENCES BETWEEN parallelshell AND GNU Parallel
2380 B<parallelshell> does not allow for composed commands:
2382 # This does not work
2383 parallelshell 'echo foo;echo bar' 'echo baz;echo quuz'
2385 Instead you have to wrap that in a shell:
2387 parallelshell 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2389 It buffers output in RAM. All commands must be given on the command
2390 line and all commands are started in parallel at the same time. This
2391 will cause the system to freeze if there are so many jobs that there
2392 is not enough memory to run them all at the same time.
2394 https://github.com/keithamus/parallelshell
2395 (Last checked: 2019-02)
2397 https://github.com/darkguy2008/parallelshell
2398 (Last checked: 2019-03)
2401 =head2 DIFFERENCES BETWEEN shell-executor AND GNU Parallel
2403 B<shell-executor> does not allow for composed commands:
2405 # This does not work
2406 sx 'echo foo;echo bar' 'echo baz;echo quuz'
2408 Instead you have to wrap that in a shell:
2410 sx 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2412 It buffers output in RAM. All commands must be given on the command
2413 line and all commands are started in parallel at the same time. This
2414 will cause the system to freeze if there are so many jobs that there
2415 is not enough memory to run them all at the same time.
2417 https://github.com/royriojas/shell-executor
2418 (Last checked: 2019-02)
2421 =head2 DIFFERENCES BETWEEN non-GNU par AND GNU Parallel
2423 B<par> buffers in memory to avoid mixing of jobs. It takes 1s per 1
2424 million output lines.
2426 B<par> needs to have all commands before starting the first job. The
2427 jobs are read from stdin (standard input) so any quoting will have to
2428 be done by the user.
2430 Stdout (standard output) is prepended with o:. Stderr (standard error)
is sent to stdout (standard output) and prepended with e:.
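The output format described above can be imitated in plain shell. This is a minimal sketch of the tagging only, not B<par>'s implementation; the function name C<tag_streams> is hypothetical, and the relative order of o: and e: lines is not guaranteed.

```shell
# Hypothetical helper: run a command and write both of its streams to
# stdout, with stdout lines prefixed by "o:" and stderr lines by "e:".
tag_streams() {
  {
    # fd 3 is a copy of the caller's stdout (see 3>&1 below).
    # Inside the group the command's stdout is tagged with o: and sent
    # to fd 3, while its stderr goes down the pipe to be tagged with e:.
    { "$@" | sed 's/^/o:/'; } 2>&1 1>&3 | sed 's/^/e:/'
  } 3>&1
}
```

For example C<tag_streams sh -c 'echo out; echo err E<gt>&2'> prints one o: line and one e: line on stdout.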
2433 For short jobs with little output B<par> is 20% faster than GNU
2434 B<parallel> and 60% slower than B<xargs>.
2436 https://github.com/UnixJunkie/PAR
2438 https://savannah.nongnu.org/projects/par
2439 (Last checked: 2019-02)
2442 =head2 DIFFERENCES BETWEEN fd AND GNU Parallel
2444 B<fd> does not support composed commands, so commands must be wrapped
2445 in B<sh -c>.
2447 It buffers output in RAM.
2449 It only takes file names from the filesystem as input (similar to B<find>).
2451 https://github.com/sharkdp/fd
2452 (Last checked: 2019-02)
2455 =head2 DIFFERENCES BETWEEN lateral AND GNU Parallel
2457 B<lateral> is very similar to B<sem>: It takes a single command and
2458 runs it in the background. The design means that output from parallel
running jobs may mix. If it dies unexpectedly it leaves a socket in
2460 ~/.lateral/socket.PID.
2462 B<lateral> deals badly with too long command lines. This makes the
2463 B<lateral> server crash:
2465 lateral run echo `seq 100000| head -c 1000k`
2467 Any options will be read by B<lateral> so this does not work
2468 (B<lateral> interprets the B<-l>):
2470 lateral run ls -l
2472 Composed commands do not work:
2474 lateral run pwd ';' ls
2476 Functions do not work:
2478 myfunc() { echo a; }
2479 export -f myfunc
2480 lateral run myfunc
2482 Running B<emacs> in the terminal causes the parent shell to die:
2484 echo '#!/bin/bash' > mycmd
2485 echo emacs -nw >> mycmd
2486 chmod +x mycmd
2487 lateral start
2488 lateral run ./mycmd
2490 Here are the examples from https://github.com/akramer/lateral with the
2491 corresponding GNU B<sem> and GNU B<parallel> commands:
2493 1$ lateral start
2494 for i in $(cat /tmp/names); do
2495 lateral run -- some_command $i
2496 done
2497 lateral wait
2499 1$ for i in $(cat /tmp/names); do
2500 sem some_command $i
2501 done
2502 sem --wait
2504 1$ parallel some_command :::: /tmp/names
2506 2$ lateral start
2507 for i in $(seq 1 100); do
2508 lateral run -- my_slow_command < workfile$i > /tmp/logfile$i
2509 done
2510 lateral wait
2512 2$ for i in $(seq 1 100); do
2513 sem my_slow_command < workfile$i > /tmp/logfile$i
2514 done
2515 sem --wait
2517 2$ parallel 'my_slow_command < workfile{} > /tmp/logfile{}' \
2518 ::: {1..100}
2520 3$ lateral start -p 0 # yup, it will just queue tasks
2521 for i in $(seq 1 100); do
2522 lateral run -- command_still_outputs_but_wont_spam inputfile$i
2523 done
2524 # command output spam can commence
2525 lateral config -p 10; lateral wait
2527 3$ for i in $(seq 1 100); do
2528 echo "command inputfile$i" >> joblist
2529 done
2530 parallel -j 10 :::: joblist
2532 3$ echo 1 > /tmp/njobs
2533 parallel -j /tmp/njobs command inputfile{} \
2534 ::: {1..100} &
2535 echo 10 >/tmp/njobs
2536 wait
2538 https://github.com/akramer/lateral
2539 (Last checked: 2019-03)
2542 =head2 DIFFERENCES BETWEEN with-this AND GNU Parallel
2544 The examples from https://github.com/amritb/with-this.git and the
2545 corresponding GNU B<parallel> command:
2547 with -v "$(cat myurls.txt)" "curl -L this"
  parallel curl -L :::: myurls.txt
2550 with -v "$(cat myregions.txt)" \
2551 "aws --region=this ec2 describe-instance-status"
2552 parallel aws --region={} ec2 describe-instance-status \
2553 :::: myregions.txt
2555 with -v "$(ls)" "kubectl --kubeconfig=this get pods"
2556 ls | parallel kubectl --kubeconfig={} get pods
2558 with -v "$(ls | grep config)" "kubectl --kubeconfig=this get pods"
2559 ls | grep config | parallel kubectl --kubeconfig={} get pods
2561 with -v "$(echo {1..10})" "echo 123"
2562 parallel -N0 echo 123 ::: {1..10}
2564 Stderr is merged with stdout. B<with-this> buffers in RAM. It uses 3x
2565 the output size, so you cannot have output larger than 1/3rd the
2566 amount of RAM. The input values cannot contain spaces. Composed
2567 commands do not work.
2569 B<with-this> gives some additional information, so the output has to
2570 be cleaned before piping it to the next command.
2572 https://github.com/amritb/with-this.git
2573 (Last checked: 2019-03)
2576 =head2 DIFFERENCES BETWEEN Tollef's parallel (moreutils) AND GNU Parallel
2578 Summary (see legend above):
2580 =over
2582 =item - - - I4 - - I7
2584 =item - - M3 - - M6
2586 =item - O2 O3 - O5 O6 - x x
2588 =item E1 - - - - - E7
2590 =item - x x x x x x x x
2592 =item - -
2594 =back
2596 =head3 EXAMPLES FROM Tollef's parallel MANUAL
2598 B<Tollef> parallel sh -c "echo hi; sleep 2; echo bye" -- 1 2 3
2600 B<GNU> parallel "echo hi; sleep 2; echo bye" ::: 1 2 3
2602 B<Tollef> parallel -j 3 ufraw -o processed -- *.NEF
2604 B<GNU> parallel -j 3 ufraw -o processed ::: *.NEF
2606 B<Tollef> parallel -j 3 -- ls df "echo hi"
2608 B<GNU> parallel -j 3 ::: ls df "echo hi"
2610 (Last checked: 2019-08)
2612 =head2 DIFFERENCES BETWEEN rargs AND GNU Parallel
2614 Summary (see legend above):
2616 =over
2618 =item I1 - - - - - I7
2620 =item - - M3 M4 - -
2622 =item - O2 O3 - O5 O6 - O8 -
2624 =item E1 - - E4 - - -
2626 =item - - - - - - - - -
2628 =item - -
2630 =back
2632 B<rargs> has elegant ways of doing named regexp capture and field ranges.
2634 With GNU B<parallel> you can use B<--rpl> to get a similar
2635 functionality as regexp capture gives, and use B<join> and B<@arg> to
2636 get the field ranges. But the syntax is longer. This:
2638 --rpl '{r(\d+)\.\.(\d+)} $_=join"$opt::colsep",@arg[$$1..$$2]'
2640 would make it possible to use:
2642 {1r3..6}
2644 for field 3..6.
2646 For full support of {n..m:s} including negative numbers use a dynamic
2647 replacement string like this:
2650 PARALLEL=--rpl\ \''{r((-?\d+)?)\.\.((-?\d+)?)((:([^}]*))?)}
2651 $a = defined $$2 ? $$2 < 0 ? 1+$#arg+$$2 : $$2 : 1;
2652 $b = defined $$4 ? $$4 < 0 ? 1+$#arg+$$4 : $$4 : $#arg+1;
2653 $s = defined $$6 ? $$7 : " ";
2654 $_ = join $s,@arg[$a..$b]'\'
2655 export PARALLEL
2657 You can then do:
2659 head /etc/passwd | parallel --colsep : echo ..={1r..} ..3={1r..3} \
2660 4..={1r4..} 2..4={1r2..4} 3..3={1r3..3} ..3:-={1r..3:-} \
2661 ..3:/={1r..3:/} -1={-1} -5={-5} -6={-6} -3..={1r-3..}
2663 =head3 EXAMPLES FROM rargs MANUAL
2665 1$ ls *.bak | rargs -p '(.*)\.bak' mv {0} {1}
2667 1$ ls *.bak | parallel mv {} {.}
2669 2$ cat download-list.csv |
2670 rargs -p '(?P<url>.*),(?P<filename>.*)' wget {url} -O {filename}
2672 2$ cat download-list.csv |
2673 parallel --csv wget {1} -O {2}
2674 # or use regexps:
2675 2$ cat download-list.csv |
2676 parallel --rpl '{url} s/,.*//' --rpl '{filename} s/.*?,//' \
2677 wget {url} -O {filename}
2679 3$ cat /etc/passwd |
2680 rargs -d: echo -e 'id: "{1}"\t name: "{5}"\t rest: "{6..::}"'
2682 3$ cat /etc/passwd |
2683 parallel -q --colsep : \
2684 echo -e 'id: "{1}"\t name: "{5}"\t rest: "{=6 $_=join":",@arg[6..$#arg]=}"'
2686 https://github.com/lotabout/rargs
2687 (Last checked: 2020-01)
2690 =head2 DIFFERENCES BETWEEN threader AND GNU Parallel
2692 Summary (see legend above):
2694 =over
2696 =item I1 - - - - - -
2698 =item M1 - M3 - - M6
2700 =item O1 - O3 - O5 - - x x
2702 =item E1 - - E4 - - -
2704 =item - - - - - - - - -
2706 =item - -
2708 =back
2710 Newline separates arguments, but newline at the end of file is treated
2711 as an empty argument. So this runs 2 jobs:
2713 echo two_jobs | threader -run 'echo "$THREADID"'
2715 B<threader> ignores stderr, so any output to stderr is
2716 lost. B<threader> buffers in RAM, so output bigger than the machine's
2717 virtual memory will cause the machine to crash.
2719 https://github.com/voodooEntity/threader
2720 (Last checked: 2020-04)
2723 =head2 DIFFERENCES BETWEEN runp AND GNU Parallel
2725 Summary (see legend above):
2727 =over
2729 =item I1 I2 - - - - -
2731 =item M1 - (M3) - - M6
2733 =item O1 O2 O3 - O5 O6 - x x -
2735 =item E1 - - - - - -
2737 =item - - - - - - - - -
2739 =item - -
2741 =back
(M3): You can add a prefix and a postfix to the input, which means the
argument can be inserted only once on the command line.
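The prefix/postfix model can be sketched in shell as follows. This is an illustration of the idea under the assumption that the parts are joined by single spaces; it is not B<runp>'s code, and C<build_commands> is a hypothetical name.

```shell
# Hypothetical helper: read arguments from stdin and print one command
# line per argument in the form "<prefix> <argument> <postfix>".
# Usage: build_commands PREFIX POSTFIX < arguments
build_commands() {
  local prefix="$1" postfix="$2" line
  while IFS= read -r line; do
    printf '%s %s %s\n' "$prefix" "$line" "$postfix"
  done
}
```

Because the argument always sits between the two fixed parts, a command that needs the argument twice (such as B<echo {} {}>) cannot be expressed this way.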
2746 B<runp> runs 10 jobs in parallel by default. B<runp> blocks if output
2747 of a command is > 64 Kbytes. Quoting of input is needed. It adds
output to stderr (this can be prevented with -q).
2750 =head3 Examples as GNU Parallel
2752 base='https://images-api.nasa.gov/search'
2753 query='jupiter'
2754 desc='planet'
2755 type='image'
2756 url="$base?q=$query&description=$desc&media_type=$type"
2758 # Download the images in parallel using runp
2759 curl -s $url | jq -r .collection.items[].href | \
2760 runp -p 'curl -s' | jq -r .[] | grep large | \
2761 runp -p 'curl -s -L -O'
2763 time curl -s $url | jq -r .collection.items[].href | \
2764 runp -g 1 -q -p 'curl -s' | jq -r .[] | grep large | \
2765 runp -g 1 -q -p 'curl -s -L -O'
2767 # Download the images in parallel
2768 curl -s $url | jq -r .collection.items[].href | \
2769 parallel curl -s | jq -r .[] | grep large | \
2770 parallel curl -s -L -O
2772 time curl -s $url | jq -r .collection.items[].href | \
2773 parallel -j 1 curl -s | jq -r .[] | grep large | \
2774 parallel -j 1 curl -s -L -O
2777 =head4 Run some test commands (read from file)
2779 # Create a file containing commands to run in parallel.
2780 cat << EOF > /tmp/test-commands.txt
2781 sleep 5
2782 sleep 3
2783 blah # this will fail
  ls $PWD # PWD shell variable is used here
  EOF
2787 # Run commands from the file.
2788 runp /tmp/test-commands.txt > /dev/null
2790 parallel -a /tmp/test-commands.txt > /dev/null
2792 =head4 Ping several hosts and see packet loss (read from stdin)
2794 # First copy this line and press Enter
2795 runp -p 'ping -c 5 -W 2' -s '| grep loss'
2796 localhost
2797 1.1.1.1
2798 8.8.8.8
2799 # Press Enter and Ctrl-D when done entering the hosts
2801 # First copy this line and press Enter
2802 parallel ping -c 5 -W 2 {} '| grep loss'
2803 localhost
2804 1.1.1.1
2805 8.8.8.8
2806 # Press Enter and Ctrl-D when done entering the hosts
2808 =head4 Get directories' sizes (read from stdin)
2810 echo -e "$HOME\n/etc\n/tmp" | runp -q -p 'sudo du -sh'
2812 echo -e "$HOME\n/etc\n/tmp" | parallel sudo du -sh
2813 # or:
2814 parallel sudo du -sh ::: "$HOME" /etc /tmp
2816 =head4 Compress files
2818 find . -iname '*.txt' | runp -p 'gzip --best'
2820 find . -iname '*.txt' | parallel gzip --best
2822 =head4 Measure HTTP request + response time
2824 export CURL="curl -w 'time_total: %{time_total}\n'"
2825 CURL="$CURL -o /dev/null -s https://golang.org/"
2826 perl -wE 'for (1..10) { say $ENV{CURL} }' |
2827 runp -q # Make 10 requests
2829 perl -wE 'for (1..10) { say $ENV{CURL} }' | parallel
2830 # or:
2831 parallel -N0 "$CURL" ::: {1..10}
2833 =head4 Find open TCP ports
2835 cat << EOF > /tmp/host-port.txt
2836 localhost 22
2837 localhost 80
2838 localhost 81
2839 127.0.0.1 443
2840 127.0.0.1 444
2841 scanme.nmap.org 22
2842 scanme.nmap.org 23
  scanme.nmap.org 443
  EOF
2846 1$ cat /tmp/host-port.txt |
2847 runp -q -p 'netcat -v -w2 -z' 2>&1 | egrep '(succeeded!|open)$'
2849 # --colsep is needed to split the line
2850 1$ cat /tmp/host-port.txt |
2851 parallel --colsep ' ' netcat -v -w2 -z 2>&1 |
2852 egrep '(succeeded!|open)$'
2853 # or use uq for unquoted:
2854 1$ cat /tmp/host-port.txt |
2855 parallel netcat -v -w2 -z {=uq=} 2>&1 |
2856 egrep '(succeeded!|open)$'
2858 https://github.com/jreisinger/runp
2859 (Last checked: 2020-04)
2862 =head2 DIFFERENCES BETWEEN papply AND GNU Parallel
2864 Summary (see legend above):
2866 =over
2868 =item - - - I4 - - -
2870 =item M1 - M3 - - M6
2872 =item - - O3 - O5 - - x x O10
2874 =item E1 - - E4 - - -
2876 =item - - - - - - - - -
2878 =item - -
2880 =back
2882 B<papply> does not print the output if the command fails:
2884 $ papply 'echo %F; false' foo
2885 "echo foo; false" did not succeed
2887 B<papply>'s replacement strings (%F %d %f %n %e %z) can be simulated in GNU
2888 B<parallel> by putting this in B<~/.parallel/config>:
2890 --rpl '%F'
2891 --rpl '%d $_=Q(::dirname($_));'
2892 --rpl '%f s:.*/::;'
2893 --rpl '%n s:.*/::;s:\.[^/.]+$::;'
2894 --rpl '%e s:.*\.:.:'
2895 --rpl '%z $_=""'
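To see what these definitions yield for a given file name, the same transformations can be written with shell parameter expansion. This is a rough sketch: C<papply_strings> is a hypothetical helper, the shell quoting done by Q() is left out, and edge cases (such as a name ending in a dot) may differ from the Perl expressions.

```shell
# Hypothetical helper: print the values of %F %d %f %n %e for one path.
papply_strings() {
  local path="$1" base name ext
  base=${path##*/}                 # %f: drop everything up to the last /
  name=${base%.*}                  # %n: %f without its last extension
  case $path in
    *.*) ext=.${path##*.} ;;       # %e: the last dot and what follows it
    *)   ext=$path ;;              # no dot: the regexp leaves the value as is
  esac
  printf '%%F=%s %%d=%s %%f=%s %%n=%s %%e=%s\n' \
    "$path" "$(dirname -- "$path")" "$base" "$name" "$ext"
}
```

Calling C<papply_strings /tmp/photos/cat.png> prints C<%F=/tmp/photos/cat.png %d=/tmp/photos %f=cat.png %n=cat %e=.png>.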
2897 B<papply> buffers in RAM, and uses twice the amount of output. So
2898 output of 5 GB takes 10 GB RAM.
2900 The buffering is very CPU intensive: Buffering a line of 5 GB takes 40
2901 seconds (compared to 10 seconds with GNU B<parallel>).
2904 =head3 Examples as GNU Parallel
2906 1$ papply gzip *.txt
2908 1$ parallel gzip ::: *.txt
2910 2$ papply "convert %F %n.jpg" *.png
2912 2$ parallel convert {} {.}.jpg ::: *.png
2915 https://pypi.org/project/papply/
2916 (Last checked: 2020-04)
2919 =head2 DIFFERENCES BETWEEN async AND GNU Parallel
2921 Summary (see legend above):
2923 =over
2925 =item - - - I4 - - I7
2927 =item - - - - - M6
2929 =item - O2 O3 - O5 O6 - x x O10
2931 =item E1 - - E4 - E6 -
2933 =item - - - - - - - - -
2935 =item S1 S2
2937 =back
B<async> is very similar to GNU B<parallel>'s B<--semaphore> mode
2940 (aka B<sem>). B<async> requires the user to start a server process.
2942 The input is quoted like B<-q> so you need B<bash -c "...;..."> to run
2943 composed commands.
2945 =head3 Examples as GNU Parallel
2947 1$ S="/tmp/example_socket"
2949 1$ ID=myid
2951 2$ async -s="$S" server --start
2953 2$ # GNU Parallel does not need a server to run
2955 3$ for i in {1..20}; do
2956 # prints command output to stdout
2957 async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
2958 done
2960 3$ for i in {1..20}; do
2961 # prints command output to stdout
2962 sem --id "$ID" -j100% "sleep 1 && echo test $i"
2963 # GNU Parallel will only print job when it is done
2964 # If you need output from different jobs to mix
2965 # use -u or --line-buffer
2966 sem --id "$ID" -j100% --line-buffer "sleep 1 && echo test $i"
2967 done
2969 4$ # wait until all commands are finished
2970 async -s="$S" wait
2972 4$ sem --id "$ID" --wait
2974 5$ # configure the server to run four commands in parallel
2975 async -s="$S" server -j4
2977 5$ export PARALLEL=-j4
2979 6$ mkdir "/tmp/ex_dir"
2980 for i in {21..40}; do
2981 # redirects command output to /tmp/ex_dir/file*
2982 async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
2983 bash -c "sleep 1 && echo test $i"
2984 done
2986 6$ mkdir "/tmp/ex_dir"
2987 for i in {21..40}; do
2988 # redirects command output to /tmp/ex_dir/file*
    sem --id "$ID" --result '/tmp/ex_dir/file-{=$_=""=}'"$i" \
2990 "sleep 1 && echo test $i"
2991 done
  7$ async -s="$S" wait
  7$ sem --id "$ID" --wait
2997 8$ # stops server
2998 async -s="$S" server --stop
3000 8$ # GNU Parallel does not need to stop a server
3003 https://github.com/ctbur/async/
3004 (Last checked: 2023-01)
3007 =head2 DIFFERENCES BETWEEN pardi AND GNU Parallel
3009 Summary (see legend above):
3011 =over
3013 =item I1 I2 - - - - I7
3015 =item M1 - - - - M6
3017 =item O1 O2 O3 O4 O5 - O7 - - O10
3019 =item E1 - - E4 - - -
3021 =item - - - - - - - - -
3023 =item - -
3025 =back
3027 B<pardi> is very similar to B<parallel --pipe --cat>: It reads blocks
3028 of data and not arguments. So it cannot insert an argument in the
3029 command line. It puts the block into a temporary file, and this file
3030 name (%IN) can be put in the command line. You can only use %IN once.
3032 It can also run full command lines in parallel (like: B<cat file |
3033 parallel>).
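The block-to-temporary-file mechanism can be sketched in shell. The sketch below is not B<pardi>'s implementation: C<run_blocks> is a hypothetical helper, it splits on a fixed number of lines only, and it runs the blocks one after another instead of in parallel.

```shell
# Hypothetical helper.
# Usage: run_blocks LINES_PER_BLOCK 'command template containing %IN'
run_blocks() {
  local n="$1" template="$2" tmpdir block cmd
  tmpdir=$(mktemp -d) || return 1
  # split(1) stores every n input lines in $tmpdir/block.aa, block.ab, ...
  split -l "$n" - "$tmpdir/block."
  for block in "$tmpdir"/block.*; do
    [ -e "$block" ] || continue   # empty input: the glob matched nothing
    case $template in
      *%IN*)
        # Substitute only the first %IN with the block's file name.
        cmd=${template%%"%IN"*}$block${template#*"%IN"} ;;
      *)
        cmd=$template ;;
    esac
    sh -c "$cmd"
  done
  rm -rf "$tmpdir"
}
```

For example C<seq 5 | run_blocks 2 'wc -l E<lt> %IN'> runs B<wc> three times: on two blocks of 2 lines and one block of 1 line.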
3035 =head3 EXAMPLES FROM pardi test.sh
3037 1$ time pardi -v -c 100 -i data/decoys.smi -ie .smi -oe .smi \
3038 -o data/decoys_std_pardi.smi \
3039 -w '(standardiser -i %IN -o %OUT 2>&1) > /dev/null'
3041 1$ cat data/decoys.smi |
3042 time parallel -N 100 --pipe --cat \
3043 '(standardiser -i {} -o {#} 2>&1) > /dev/null; cat {#}; rm {#}' \
3044 > data/decoys_std_pardi.smi
3046 2$ pardi -n 1 -i data/test_in.types -o data/test_out.types \
3047 -d 'r:^#atoms:' -w 'cat %IN > %OUT'
3049 2$ cat data/test_in.types |
3050 parallel -n 1 -k --pipe --cat --regexp --recstart '^#atoms' \
3051 'cat {}' > data/test_out.types
3053 3$ pardi -c 6 -i data/test_in.types -o data/test_out.types \
3054 -d 'r:^#atoms:' -w 'cat %IN > %OUT'
3056 3$ cat data/test_in.types |
3057 parallel -n 6 -k --pipe --cat --regexp --recstart '^#atoms' \
3058 'cat {}' > data/test_out.types
3060 4$ pardi -i data/decoys.mol2 -o data/still_decoys.mol2 \
3061 -d 's:@<TRIPOS>MOLECULE' -w 'cp %IN %OUT'
3063 4$ cat data/decoys.mol2 |
3064 parallel -n 1 --pipe --cat --recstart '@<TRIPOS>MOLECULE' \
3065 'cp {} {#}; cat {#}; rm {#}' > data/still_decoys.mol2
3067 5$ pardi -i data/decoys.mol2 -o data/decoys2.mol2 \
3068 -d b:10000 -w 'cp %IN %OUT' --preserve
3070 5$ cat data/decoys.mol2 |
3071 parallel -k --pipe --block 10k --recend '' --cat \
3072 'cat {} > {#}; cat {#}; rm {#}' > data/decoys2.mol2
3074 https://github.com/UnixJunkie/pardi
3075 (Last checked: 2021-01)
3078 =head2 DIFFERENCES BETWEEN bthread AND GNU Parallel
3080 Summary (see legend above):
3082 =over
3084 =item - - - I4 - - -
3086 =item - - - - - M6
3088 =item O1 - O3 - - - O7 O8 - -
3090 =item E1 - - - - - -
3092 =item - - - - - - - - -
3094 =item - -
3096 =back
3098 B<bthread> takes around 1 sec per MB of output. The maximal output
3099 line length is 1073741759.
3101 You cannot quote space in the command, so you cannot run composed
3102 commands like B<sh -c "echo a; echo b">.
3104 https://gitlab.com/netikras/bthread
3105 (Last checked: 2021-01)
3108 =head2 DIFFERENCES BETWEEN simple_gpu_scheduler AND GNU Parallel
3110 Summary (see legend above):
3112 =over
3114 =item I1 - - - - - I7
3116 =item M1 - - - - M6
3118 =item - O2 O3 - - O6 - x x O10
3120 =item E1 - - - - - -
3122 =item - - - - - - - - -
3124 =item - -
3126 =back
3128 =head3 EXAMPLES FROM simple_gpu_scheduler MANUAL
3130 1$ simple_gpu_scheduler --gpus 0 1 2 < gpu_commands.txt
3132 1$ parallel -j3 --shuf \
3133 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' \
3134 < gpu_commands.txt
3136 2$ simple_hypersearch \
3137 "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
3138 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
3139 simple_gpu_scheduler --gpus 0,1,2
3141 2$ parallel --header : --shuf -j3 -v \
3142 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =}' \
3143 python3 train_dnn.py --lr {lr} --batch_size {bs} \
3144 ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
3146 3$ simple_hypersearch \
3147 "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
3148 --n-samples 5 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
3149 simple_gpu_scheduler --gpus 0,1,2
3151 3$ parallel --header : --shuf \
3152 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1; seq()>5 and skip() =}' \
3153 python3 train_dnn.py --lr {lr} --batch_size {bs} \
3154 ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
3156 4$ touch gpu.queue
3157 tail -f -n 0 gpu.queue | simple_gpu_scheduler --gpus 0,1,2 &
3158 echo "my_command_with | and stuff > logfile" >> gpu.queue
3160 4$ touch gpu.queue
3161 tail -f -n 0 gpu.queue |
3162 parallel -j3 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' &
3163 # Needed to fill job slots once
3164 seq 3 | parallel echo true >> gpu.queue
3165 # Add jobs
3166 echo "my_command_with | and stuff > logfile" >> gpu.queue
3167 # Needed to flush output from completed jobs
3168 seq 3 | parallel echo true >> gpu.queue
3170 https://github.com/ExpectationMax/simple_gpu_scheduler
3171 (Last checked: 2021-01)
3174 =head2 DIFFERENCES BETWEEN parasweep AND GNU Parallel
3176 B<parasweep> is a Python module for facilitating parallel parameter
3177 sweeps.
3179 A B<parasweep> job will normally take a text file as input. The text
3180 file contains arguments for the job. Some of these arguments will be
3181 fixed and some of them will be changed by B<parasweep>.
3183 It does this by having a template file such as template.txt:
3185 Xval: {x}
3186 Yval: {y}
3187 FixedValue: 9
3188 # x with 2 decimals
3189 DecimalX: {x:.2f}
3190 TenX: ${x*10}
3191 RandomVal: {r}
3193 and from this template it generates the file to be used by the job by
3194 replacing the replacement strings.
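The basic substitution step can be sketched in shell. This is only an illustration of the idea and not how B<parasweep> works internally: C<fill_template> is a hypothetical helper, it handles only the plain {x} and {y} placeholders (not format specifiers such as {x:.2f}), and it assumes the values contain no /, & or backslash.

```shell
# Hypothetical helper: read a template on stdin and print it with
# every {x} and {y} replaced by the given values.
# Usage: fill_template X Y < template.txt
fill_template() {
  local x="$1" y="$2"
  sed -e "s/{x}/$x/g" -e "s/{y}/$y/g"
}
```

A sweep is then a nested loop over the parameter values that calls C<fill_template> once per combination and writes each result to its own file.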
3196 Being a Python module B<parasweep> integrates tighter with Python than
3197 GNU B<parallel>. You get the parameters directly in a Python data
3198 structure. With GNU B<parallel> you can use the JSON or CSV output
3199 format to get something similar, but you would have to read the
3200 output.
3202 B<parasweep> has a filtering method to ignore parameter combinations
3203 you do not need.
3205 Instead of calling the jobs directly, B<parasweep> can use Python's
3206 Distributed Resource Management Application API to make jobs run with
3207 different cluster software.
3210 GNU B<parallel> B<--tmpl> supports templates with replacement
3211 strings. Such as:
3213 Xval: {x}
3214 Yval: {y}
3215 FixedValue: 9
3216 # x with 2 decimals
3217 DecimalX: {=x $_=sprintf("%.2f",$_) =}
3218 TenX: {=x $_=$_*10 =}
3219 RandomVal: {=1 $_=rand() =}
3221 that can be used like:
3223 parallel --header : --tmpl my.tmpl={#}.t myprog {#}.t \
3224 ::: x 1 2 3 ::: y 1 2 3
3226 Filtering is supported as:
3228 parallel --filter '{1} > {2}' echo ::: 1 2 3 ::: 1 2 3
3230 https://github.com/eviatarbach/parasweep
3231 (Last checked: 2021-01)
3234 =head2 DIFFERENCES BETWEEN parallel-bash AND GNU Parallel
3236 Summary (see legend above):
3238 =over
3240 =item I1 I2 - - - - -
3242 =item - - M3 - - M6
3244 =item - O2 O3 - O5 O6 - O8 x O10
3246 =item E1 - - - - - -
3248 =item - - - - - - - - -
3250 =item - -
3252 =back
3254 B<parallel-bash> is written in pure bash. It is really fast (overhead
3255 of ~0.05 ms/job compared to GNU B<parallel>'s 3-10 ms/job). So if your
3256 jobs are extremely short lived, and you can live with the quite
3257 limited command, this may be useful.
3259 It works by making a queue for each process. Then the jobs are
3260 distributed to the queues in a round robin fashion. Finally the queues
3261 are started in parallel. This works fine, if you are lucky, but if
3262 not, all the long jobs may end up in the same queue, so you may see:
3264 $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
3265 time parallel -P4 sleep {}
3266 (7 seconds)
3267 $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
3268 time ./parallel-bash.bash -p 4 -c sleep {}
3269 (12 seconds)
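The two timings can be reproduced without running any jobs. The bash sketch below computes the total run time for a list of job durations under both strategies; the function names are hypothetical and start-up overhead is ignored.

```shell
# Run time when jobs are dealt round robin to one fixed queue per slot.
# Usage: round_robin_makespan SLOTS DURATION...
round_robin_makespan() {
  local slots="$1" i=0 max=0 d q
  shift
  local -a load=()
  for d in "$@"; do
    q=$(( i % slots ))
    load[q]=$(( ${load[q]:-0} + d ))
    (( load[q] > max )) && max=${load[q]}
    i=$(( i + 1 ))
  done
  echo "$max"
}

# Run time when each job is given to the slot that becomes free first.
# Usage: dynamic_makespan SLOTS DURATION...
dynamic_makespan() {
  local slots="$1" max=0 d s best
  shift
  local -a free_at=()
  for (( s = 0; s < slots; s++ )); do free_at[s]=0; done
  for d in "$@"; do
    best=0
    for (( s = 1; s < slots; s++ )); do
      (( free_at[s] < free_at[best] )) && best=$s
    done
    free_at[best]=$(( free_at[best] + d ))
    (( free_at[best] > max )) && max=${free_at[best]}
  done
  echo "$max"
}
```

With 4 slots and the durations 1 1 1 4 1 1 1 4 1 1 1 4, round robin puts all three 4-second jobs in the last queue and gives 12, whereas handing each job to the first free slot gives 7.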
3271 Because it uses bash lists, the total number of jobs is limited to
3272 167000..265000 depending on your environment. You get a segmentation
3273 fault, when you reach the limit.
3275 Ctrl-C does not stop spawning new jobs. Ctrl-Z does not suspend
3276 running jobs.
3279 =head3 EXAMPLES FROM parallel-bash
3281 1$ some_input | parallel-bash -p 5 -c echo
3283 1$ some_input | parallel -j 5 echo
3285 2$ parallel-bash -p 5 -c echo < some_file
3287 2$ parallel -j 5 echo < some_file
3289 3$ parallel-bash -p 5 -c echo <<< 'some string'
  3$ parallel -j 5 echo <<< 'some string'
3293 4$ something | parallel-bash -p 5 -c echo {} {}
3295 4$ something | parallel -j 5 echo {} {}
3297 https://reposhub.com/python/command-line-tools/Akianonymus-parallel-bash.html
3298 (Last checked: 2021-06)
3301 =head2 DIFFERENCES BETWEEN bash-concurrent AND GNU Parallel
3303 B<bash-concurrent> is more an alternative to B<make> than to GNU
3304 B<parallel>. Its input is very similar to a Makefile, where jobs
3305 depend on other jobs.
3307 It has a nice progress indicator where you can see which jobs
3308 completed successfully, which jobs are currently running, which jobs
failed, and which jobs were skipped because a job they depend on failed.
3310 The indicator does not deal well with resizing the window.
3312 Output is cached in tempfiles on disk, but is only shown if there is
3313 an error, so it is not meant to be part of a UNIX pipeline. If
3314 B<bash-concurrent> crashes these tempfiles are not removed.
3316 It uses an O(n*n) algorithm, so if you have 1000 independent jobs it
3317 takes 22 seconds to start it.
3319 https://github.com/themattrix/bash-concurrent
3320 (Last checked: 2021-02)
3323 =head2 DIFFERENCES BETWEEN spawntool AND GNU Parallel
3325 Summary (see legend above):
3327 =over
3329 =item I1 - - - - - -
3331 =item M1 - - - - M6
3333 =item - O2 O3 - O5 O6 - x x O10
3335 =item E1 - - - - - -
3337 =item - - - - - - - - -
3339 =item - -
3341 =back
3343 B<spawn> reads a full command line from stdin which it executes in
3344 parallel.
3347 http://code.google.com/p/spawntool/
3348 (Last checked: 2021-07)
3351 =head2 DIFFERENCES BETWEEN go-pssh AND GNU Parallel
3353 Summary (see legend above):
3355 =over
3357 =item - - - - - - -
3359 =item M1 - - - - -
3361 =item O1 - - - - - - x x O10
3363 =item E1 - - - - - -
3365 =item R1 R2 - - - R6 - - -
3367 =item - -
3369 =back
3371 B<go-pssh> does B<ssh> in parallel to multiple machines. It runs the
3372 same command on multiple machines similar to B<--nonall>.
The hosts must be given as IP addresses (not as hostnames).
3376 Output is sent to stdout (standard output) if command is successful,
3377 and to stderr (standard error) if the command fails.
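
The output routing described above can be sketched in plain bash. This
is only an illustration of the described behaviour, not code from
B<go-pssh>; the function name B<route_output> is made up for this
example.

```shell
#!/bin/bash

# Hypothetical helper (not part of go-pssh): run a command, buffer its
# combined output, and print the buffer on stdout if the command
# succeeded or on stderr if it failed.
route_output() {
  local out status
  out=$("$@" 2>&1)
  status=$?
  if [ "$status" -eq 0 ]; then
    printf '%s\n' "$out"
  else
    printf '%s\n' "$out" >&2
  fi
  return "$status"
}

route_output echo ok   # succeeds, so "ok" is printed on stdout
```

A failing command, such as B<route_output ls /nonexistent>, would have
its message printed on stderr and its exit value passed on.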
3379 =head3 EXAMPLES FROM go-pssh
3381 1$ go-pssh -l <ip>,<ip> -u <user> -p <port> -P <passwd> -c "<command>"
3383 1$ parallel -S 'sshpass -p <passwd> ssh -p <port> <user>@<ip>' \
3384 --nonall "<command>"
3386 2$ go-pssh scp -f host.txt -u <user> -p <port> -P <password> \
3387 -s /local/file_or_directory -d /remote/directory
  2$ parallel --nonall --slf host.txt \
       --basefile /local/file_or_directory/./ --wd /remote/directory \
       --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true
3393 3$ go-pssh scp -l <ip>,<ip> -u <user> -p <port> -P <password> \
3394 -s /local/file_or_directory -d /remote/directory
  3$ parallel --nonall -S <ip>,<ip> \
       --basefile /local/file_or_directory/./ --wd /remote/directory \
       --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true
3400 https://github.com/xuchenCN/go-pssh
3401 (Last checked: 2021-07)
3404 =head2 DIFFERENCES BETWEEN go-parallel AND GNU Parallel
3406 Summary (see legend above):
3408 =over
3410 =item I1 I2 - - - - I7
3412 =item - - M3 - - M6
3414 =item - O2 O3 - O5 - - x x - O10
3416 =item E1 - - E4 - - -
3418 =item - - - - - - - - -
3420 =item - -
3422 =back
3424 B<go-parallel> uses Go templates for replacement strings. Quite
3425 similar to the I<{= perl expr =}> replacement string.
3427 =head3 EXAMPLES FROM go-parallel
3429 1$ go-parallel -a ./files.txt -t 'cp {{.Input}} {{.Input | dirname | dirname}}'
3431 1$ parallel -a ./files.txt cp {} '{= $_=::dirname(::dirname($_)) =}'
3433 2$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{noExt .Input}}'
3435 2$ parallel -a ./files.txt echo mkdir -p {} {.}
3437 3$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{.Input | basename | noExt}}'
3439 3$ parallel -a ./files.txt echo mkdir -p {} {/.}
3441 https://github.com/mylanconnolly/parallel
3442 (Last checked: 2021-07)
3445 =head2 DIFFERENCES BETWEEN p AND GNU Parallel
3447 Summary (see legend above):
3449 =over
3451 =item - - - I4 - - x
3453 =item - - - - - M6
3455 =item - O2 O3 - O5 O6 - x x - O10
3457 =item E1 - - - - - -
3459 =item - - - - - - - - -
3461 =item - -
3463 =back
3465 B<p> is a tiny shell script. It can color output with some predefined
3466 colors, but is otherwise quite limited.
3468 It maxes out at around 116000 jobs (probably due to limitations in Bash).
3470 =head3 EXAMPLES FROM p
Some of the examples from B<p> cannot be implemented 100% by GNU
B<parallel>: The coloring is not exactly the same, and GNU B<parallel>
cannot have B<--tag> for some inputs and not for others.
3478 1$ p -bc blue "ping 127.0.0.1" -uc red "ping 192.168.0.1" \
3479 -rc yellow "ping 192.168.1.1" -t example "ping example.com"
3481 1$ parallel --lb -j0 --color --tag ping \
3482 ::: 127.0.0.1 192.168.0.1 192.168.1.1 example.com
3484 2$ p "tail -f /var/log/httpd/access_log" \
3485 -bc red "tail -f /var/log/httpd/error_log"
3487 2$ cd /var/log/httpd;
3488 parallel --lb --color --tag tail -f ::: access_log error_log
3490 3$ p tail -f "some file" \& p tail -f "other file with space.txt"
3492 3$ parallel --lb tail -f ::: 'some file' "other file with space.txt"
3494 4$ p -t project1 "hg pull project1" -t project2 \
3495 "hg pull project2" -t project3 "hg pull project3"
3497 4$ parallel --lb hg pull ::: project{1..3}
3499 https://github.com/rudymatela/evenmoreutils/blob/master/man/p.1.adoc
3500 (Last checked: 2022-04)
=head2 DIFFERENCES BETWEEN seneschal AND GNU Parallel
3505 Summary (see legend above):
3507 =over
3509 =item I1 - - - - - -
3511 =item M1 - M3 - - M6
3513 =item O1 - O3 O4 - - - x x -
3515 =item E1 - - - - - -
3517 =item - - - - - - - - -
3519 =item - -
3521 =back
3523 B<seneschal> only starts the first job after reading the last job, and
3524 output from the first job is only printed after the last job finishes.
1 byte of output requires 3.5 bytes of RAM.
3528 This makes it impossible to have a total output bigger than the
3529 virtual memory.
Even though output is kept in RAM, outputting is quite slow: 30 MB/s.
3533 Output larger than 4 GB causes random problems - it looks like a race
3534 condition.
3536 This:
3538 echo 1 | seneschal --prefix='yes `seq 1000`|head -c 1G' >/dev/null
3540 takes 4100(!) CPU seconds to run on a 64C64T server, but only 140 CPU
3541 seconds on a 4C8T laptop. So it looks like B<seneschal> wastes a lot
3542 of CPU time coordinating the CPUs.
3544 Compare this to:
3546 echo 1 | time -v parallel -N0 'yes `seq 1000`|head -c 1G' >/dev/null
3548 which takes 3-8 CPU seconds.
3550 =head3 EXAMPLES FROM seneschal README.md
3552 1$ echo $REPOS | seneschal --prefix="cd {} && git pull"
3554 # If $REPOS is newline separated
3555 1$ echo "$REPOS" | parallel -k "cd {} && git pull"
3556 # If $REPOS is space separated
3557 1$ echo -n "$REPOS" | parallel -d' ' -k "cd {} && git pull"
3559 COMMANDS="pwd
3560 sleep 5 && echo boom
3561 echo Howdy
3562 whoami"
3564 2$ echo "$COMMANDS" | seneschal --debug
3566 2$ echo "$COMMANDS" | parallel -k -v
3568 3$ ls -1 | seneschal --prefix="pushd {}; git pull; popd;"
3570 3$ ls -1 | parallel -k "pushd {}; git pull; popd;"
3571 # Or if current dir also contains files:
3572 3$ parallel -k "pushd {}; git pull; popd;" ::: */
3574 https://github.com/TheWizardTower/seneschal
3575 (Last checked: 2022-06)
3578 =head2 DIFFERENCES BETWEEN async AND GNU Parallel
3580 Summary (see legend above):
3582 =over
3584 =item x x x x x x x
3586 =item - x x x x x
3588 =item x O2 O3 O4 O5 O6 - x x O10
3590 =item E1 - - E4 - - -
3592 =item - - - - - - - - -
3594 =item S1 S2
3596 =back
3598 B<async> works like B<sem>.
3601 =head3 EXAMPLES FROM async
3603 1$ S="/tmp/example_socket"
3605 async -s="$S" server --start
3607 for i in {1..20}; do
3608 # prints command output to stdout
3609 async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
3610 done
3612 # wait until all commands are finished
3613 async -s="$S" wait
3615 1$ S="example_id"
3617 # server not needed
3619 for i in {1..20}; do
3620 # prints command output to stdout
3621 sem --bg --id "$S" -j100% "sleep 1 && echo test $i"
3622 done
3624 # wait until all commands are finished
3625 sem --fg --id "$S" --wait
3627 2$ # configure the server to run four commands in parallel
3628 async -s="$S" server -j4
3630 mkdir "/tmp/ex_dir"
3631 for i in {21..40}; do
3632 # redirects command output to /tmp/ex_dir/file*
3633 async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
3634 bash -c "sleep 1 && echo test $i"
3635 done
3637 async -s="$S" wait
3639 # stops server
3640 async -s="$S" server --stop
3642 2$ # starting server not needed
3644 mkdir "/tmp/ex_dir"
3645 for i in {21..40}; do
3646 # redirects command output to /tmp/ex_dir/file*
3647 sem --bg --id "$S" --results "/tmp/ex_dir/file$i{}" \
3648 "sleep 1 && echo test $i"
3649 done
3651 sem --fg --id "$S" --wait
3653 # there is no server to stop
3655 https://github.com/ctbur/async
3656 (Last checked: 2023-01)
3659 =head2 DIFFERENCES BETWEEN tandem AND GNU Parallel
3661 Summary (see legend above):
3663 =over
3665 =item - - - I4 - - x
3667 =item M1 - - - - M6
3669 =item - - O3 - - - - x - -
3671 =item E1 - E3 - E5 - -
3673 =item - - - - - - - - -
3675 =item - -
3677 =back
3679 B<tandem> runs full commands in parallel. It is made for starting a
3680 "server", running a job against the server, and when the job is done,
3681 the server is killed.
3683 More generally: it kills all jobs when the first job completes -
3684 similar to '--halt now,done=1'.
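
The "kill all jobs when the first job completes" behaviour can be
sketched in plain bash. This is a minimal illustration under stated
assumptions, not B<tandem>'s implementation: the function name
B<first_done_kills_rest> is made up, it needs bash 4.3 or later for
B<wait -n>, and it only signals the direct children, so processes
started by a killed job may survive.

```shell
#!/bin/bash

# Hypothetical helper: start every argument as a shell command in the
# background, wait for the first one to finish, then kill the rest.
# Requires bash >= 4.3 for 'wait -n'.
first_done_kills_rest() {
  local cmd pids=()
  for cmd in "$@"; do
    bash -c "$cmd" &
    pids+=("$!")
  done
  wait -n                                 # returns when the first job exits
  kill "${pids[@]}" 2>/dev/null || true   # the finished job is already gone
  wait 2>/dev/null                        # reap the killed jobs
}

# The first job finishes after 1 second; the 30 second sleep is killed
first_done_kills_rest 'sleep 1; echo fast' 'sleep 30'
```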
3686 B<tandem> silently discards some output. It is unclear exactly when
3687 this happens. It looks like a race condition, because it varies for
3688 each run.
3690 $ tandem "seq 10000" | wc -l
3691 6731 <- This should always be 10002
3694 =head3 EXAMPLES FROM Demo
3696 tandem \
3697 'php -S localhost:8000' \
3698 'esbuild src/*.ts --bundle --outdir=dist --watch' \
3699 'tailwind -i src/index.css -o dist/index.css --watch'
3701 # Emulate tandem's behaviour
3702 PARALLEL='--color --lb --halt now,done=1 --tagstring '
3703 PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
3704 export PARALLEL
3706 parallel ::: \
3707 'php -S localhost:8000' \
3708 'esbuild src/*.ts --bundle --outdir=dist --watch' \
3709 'tailwind -i src/index.css -o dist/index.css --watch'
3712 =head3 EXAMPLES FROM tandem -h
3714 # Emulate tandem's behaviour
3715 PARALLEL='--color --lb --halt now,done=1 --tagstring '
3716 PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
3717 export PARALLEL
3719 1$ tandem 'sleep 5 && echo "hello"' 'sleep 2 && echo "world"'
3721 1$ parallel ::: 'sleep 5 && echo "hello"' 'sleep 2 && echo "world"'
  # '-t 0' fails, but '--timeout 0' works
3724 2$ tandem --timeout 0 'sleep 5 && echo "hello"' \
3725 'sleep 2 && echo "world"'
3727 2$ parallel --timeout 0 ::: 'sleep 5 && echo "hello"' \
3728 'sleep 2 && echo "world"'
3730 =head3 EXAMPLES FROM tandem's readme.md
3732 # Emulate tandem's behaviour
3733 PARALLEL='--color --lb --halt now,done=1 --tagstring '
3734 PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
3735 export PARALLEL
3737 1$ tandem 'next dev' 'nodemon --quiet ./server.js'
3739 1$ parallel ::: 'next dev' 'nodemon --quiet ./server.js'
  2$ cat package.json
     {
       "scripts": {
         "dev:php": "...",
         "dev:js": "...",
         "dev:css": "..."
       }
     }

     tandem 'npm:dev:php' 'npm:dev:js' 'npm:dev:css'
3752 # GNU Parallel uses bash functions instead
3753 2$ cat package.sh
3754 dev:php() { ... ; }
3755 dev:js() { ... ; }
3756 dev:css() { ... ; }
3757 export -f dev:php dev:js dev:css
3759 . package.sh
3760 parallel ::: dev:php dev:js dev:css
3762 3$ tandem 'npm:dev:*'
3764 3$ compgen -A function | grep ^dev: | parallel
3766 For usage in Makefiles, include a copy of GNU Parallel with your
3767 source using `parallel --embed`. This has the added benefit of also
3768 working if access to the internet is down or restricted.
3770 https://github.com/rosszurowski/tandem
3771 (Last checked: 2023-01)
3774 =head2 DIFFERENCES BETWEEN rust-parallel(aaronriekenberg) AND GNU Parallel
3776 Summary (see legend above):
3778 =over
3780 =item I1 I2 I3 - - - -
3782 =item - - - - - M6
3784 =item O1 O2 O3 - O5 O6 - x - O10
3786 =item E1 - - E4 - - -
3788 =item - - - - - - - - -
3790 =item - -
3792 =back
3794 B<rust-parallel> has a goal of only using Rust. It seems it is
3795 impossible to call bash functions from the command line. You would
3796 need to put these in a script.
3798 Calling a script that misses the shebang line (#! as first line)
3799 fails.
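
One way to work around both limitations is to write the bash function
into a small script that starts with a shebang line. The sketch below
shows the idea; the helper name B<function_to_script> and the example
function B<greet> are made up for this illustration.

```shell
#!/bin/bash

# Hypothetical helper: write a bash function into a standalone script
# that starts with a shebang line, so a tool that can only exec
# programs can still run it.
function_to_script() {
  local fn=$1 script=$2
  {
    echo '#!/bin/bash'
    declare -f "$fn"        # prints the function definition
    echo "$fn \"\$@\""      # call it with the script's arguments
  } > "$script"
  chmod +x "$script"
}

greet() { echo "hello $1"; }

dir=$(mktemp -d)
function_to_script greet "$dir/greet"
"$dir/greet" world
rm -r "$dir"
```

The generated script can then be given as the command to a tool that
cannot see exported shell functions.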
3801 =head3 EXAMPLES FROM rust-parallel's README.md
  $ cat >./test <<EOL
  echo hi
  echo there
  echo how
  echo are
  echo you
  EOL
3811 1$ cat test | rust-parallel -j5
3813 1$ cat test | parallel -j5
3815 2$ cat test | rust-parallel -j1
3817 2$ cat test | parallel -j1
3819 3$ head -100 /usr/share/dict/words | rust-parallel md5 -s
3821 3$ head -100 /usr/share/dict/words | parallel md5 -s
3823 4$ find . -type f -print0 | rust-parallel -0 gzip -f -k
3825 4$ find . -type f -print0 | parallel -0 gzip -f -k
3827 5$ head -100 /usr/share/dict/words |
3828 awk '{printf "md5 -s %s\n", $1}' | rust-parallel
3830 5$ head -100 /usr/share/dict/words |
3831 awk '{printf "md5 -s %s\n", $1}' | parallel
3833 6$ head -100 /usr/share/dict/words | rust-parallel md5 -s |
3834 grep -i abba
3836 6$ head -100 /usr/share/dict/words | parallel md5 -s |
3837 grep -i abba
3839 https://github.com/aaronriekenberg/rust-parallel
3840 (Last checked: 2023-01)
3843 =head2 DIFFERENCES BETWEEN parallelium AND GNU Parallel
3845 Summary (see legend above):
3847 =over
3849 =item - I2 - - - - -
3851 =item M1 - - - - M6
3853 =item O1 - O3 - - - - x - -
3855 =item E1 - - E4 - - -
3857 =item - - - - - - - - -
3859 =item - -
3861 =back
3863 B<parallelium> merges standard output (stdout) and standard error
3864 (stderr). The maximal output of a command is 8192 bytes. Bigger output
3865 makes B<parallelium> go into an infinite loop.
3867 In the input file for B<parallelium> you can define a tag, so that you
3868 can select to run only these commands. A bit like a target in a
3869 Makefile.
3871 Progress is printed on standard output (stdout) prepended with '#'
3872 with similar information as GNU B<parallel>'s B<--bar>.
3874 =head3 EXAMPLES
3876 $ cat testjobs.txt
3877 #tag common sleeps classA
3878 (sleep 4.495;echo "job 000")
3880 (sleep 2.587;echo "job 016")
3882 #tag common sleeps classB
3883 (sleep 0.218;echo "job 017")
3885 (sleep 2.269;echo "job 040")
3887 #tag common sleeps classC
3888 (sleep 2.586;echo "job 041")
3890 (sleep 1.626;echo "job 099")
3892 #tag lasthalf, sleeps, classB
3893 (sleep 1.540;echo "job 100")
3895 (sleep 2.001;echo "job 199")
3897 1$ parallelium -f testjobs.txt -l logdir -t classB,classC
3899 1$ cat testjobs.txt |
3900 parallel --plus --results logdir/testjobs.txt_{0#}.output \
3901 '{= if(/^#tag /) { @tag = split/,|\s+/ }
3902 (grep /^(classB|classC)$/, @tag) or skip =}'
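
The tag selection can also be sketched without GNU B<parallel>. The
B<awk> filter below prints only the commands that follow a B<#tag> line
containing one of the wanted tags. It assumes the file layout shown in
B<testjobs.txt> above; the function name B<select_tagged> is made up,
and this is not how B<parallelium> itself is implemented.

```shell
#!/bin/bash

# Hypothetical filter: print the commands that belong to a '#tag' line
# matching one of the wanted tags (given as a regexp alternation).
select_tagged() {
  awk -v want="$1" '
    /^#tag / {
      keep = 0
      n = split($0, t, /[ ,]+/)      # tags are separated by space or comma
      for (i = 2; i <= n; i++)
        if (t[i] ~ ("^(" want ")$")) keep = 1
      next
    }
    keep && NF                       # print non-empty lines of kept groups
  '
}

printf '%s\n' '#tag common classA' 'echo job A' \
              '#tag lasthalf, sleeps, classB' 'echo job B' |
  select_tagged 'classB|classC'
```

The selected lines are full commands, so they can be piped to
B<parallel> or B<bash>.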
3904 https://github.com/beomagi/parallelium
3905 (Last checked: 2023-01)
3908 =head2 DIFFERENCES BETWEEN forkrun AND GNU Parallel
3910 Summary (see legend above):
3912 =over
3914 =item I1 - - - - - I7
3916 =item - - - - - -
3918 =item - O2 O3 - O5 - - - - O10
3920 =item E1 - - E4 - - -
3922 =item - - - - - - - - -
3924 =item - -
3926 =back
3929 B<forkrun> blocks if it receives fewer jobs than slots:
3931 echo | forkrun -p 2 echo
3933 or when it gets some specific commands e.g.:
3935 f() { seq "$@" | pv -qL 3; }
3936 seq 10 | forkrun f
3938 It is not clear why.
3940 It is faster than GNU B<parallel> (overhead: 1.2 ms/job vs 3 ms/job),
3941 but way slower than B<parallel-bash> (0.059 ms/job).
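
To put the per-job overhead into perspective, the sketch below
computes the total overhead for a given number of jobs. It assumes the
overhead grows linearly with the number of jobs, which is an assumption
and has not been measured for large job counts; the function name
B<overhead_seconds> is made up.

```shell
#!/bin/bash

# Total overhead in seconds for a number of jobs, given the overhead
# per job in milliseconds (assumes linear scaling).
overhead_seconds() {
  local jobs=$1 ms_per_job=$2
  awk -v n="$jobs" -v ms="$ms_per_job" \
    'BEGIN { printf "%.0f\n", n * ms / 1000 }'
}

overhead_seconds 1000000 3      # GNU parallel
overhead_seconds 1000000 1.2    # forkrun
overhead_seconds 1000000 0.059  # parallel-bash
```

For jobs that run for seconds the difference hardly matters; it only
becomes visible with very many very short jobs.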
3943 Running jobs cannot be stopped by pressing CTRL-C.
3945 B<-k> is supposed to keep the order but fails on the MIX testing
3946 example below. If used with B<-k> it caches output in RAM.
If B<forkrun> is killed, it leaves temporary files in
B</tmp/.forkrun.*> that have to be cleaned up manually.
3951 =head3 EXAMPLES
3953 1$ time find ./ -type f |
3954 forkrun -l512 -- sha256sum 2>/dev/null | wc -l
3955 1$ time find ./ -type f |
3956 parallel -j28 -m -- sha256sum 2>/dev/null | wc -l
3958 2$ time find ./ -type f |
3959 forkrun -l512 -k -- sha256sum 2>/dev/null | wc -l
3960 2$ time find ./ -type f |
3961 parallel -j28 -k -m -- sha256sum 2>/dev/null | wc -l
3963 https://github.com/jkool702/forkrun
3964 (Last checked: 2023-02)
3967 =head2 DIFFERENCES BETWEEN parallel-sh AND GNU Parallel
3969 Summary (see legend above):
3971 =over
3973 =item I1 I2 - I4 - - -
3975 =item M1 - - - - M6
3977 =item O1 O2 O3 - O5 O6 - - - O10
3979 =item E1 - - E4 - - -
3981 =item - - - - - - - - -
3983 =item - -
3985 =back
B<parallel-sh> buffers in RAM. Buffering the data takes O(n^1.5) time:
3989 2MB=0.107s 4MB=0.175s 8MB=0.342s 16MB=0.766s 32MB=2.2s 64MB=6.7s
3990 128MB=20s 256MB=64s 512MB=248s 1024MB=998s 2048MB=3756s
3992 It limits the practical usability to jobs outputting < 256 MB. GNU
3993 B<parallel> buffers on disk, yet is faster for jobs with outputs > 16
3994 MB and is only limited by the free space in $TMPDIR.
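
The exponent can be checked from the timings listed above: if time
grows as size^k, then k = log(t2/t1) / log(s2/s1) for two measurements.
The sketch below does this for the smallest and the largest
measurement; the function name B<growth_exponent> is made up for this
example.

```shell
#!/bin/bash

# Estimate the exponent k in time ~ size^k from two measurements:
# k = log(t2/t1) / log(s2/s1)
growth_exponent() {
  awk -v s1="$1" -v t1="$2" -v s2="$3" -v t2="$4" \
    'BEGIN { printf "%.2f\n", log(t2 / t1) / log(s2 / s1) }'
}

# 2 MB took 0.107 s and 2048 MB took 3756 s
growth_exponent 2 0.107 2048 3756
```

This prints 1.51, which is consistent with O(n^1.5).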
3996 B<parallel-sh> can kill running jobs if a job fails (Similar to
3997 B<--halt now,fail=1>).
3999 =head3 EXAMPLES
4001 1$ parallel-sh "sleep 2 && echo first" "sleep 1 && echo second"
4003 1$ parallel ::: "sleep 2 && echo first" "sleep 1 && echo second"
4005 2$ cat /tmp/commands
4006 sleep 2 && echo first
4007 sleep 1 && echo second
4009 2$ parallel-sh -f /tmp/commands
4011 2$ parallel -a /tmp/commands
4013 3$ echo -e 'sleep 2 && echo first\nsleep 1 && echo second' |
4014 parallel-sh
4016 3$ echo -e 'sleep 2 && echo first\nsleep 1 && echo second' |
4017 parallel
4019 https://github.com/thyrc/parallel-sh
4020 (Last checked: 2023-04)
4023 =head2 DIFFERENCES BETWEEN bash-parallel AND GNU Parallel
4025 Summary (see legend above):
4027 =over
4029 =item - I2 - - - - I7
4031 =item M1 - M3 - M5 M6
4033 =item - O2 O3 - - O6 - O8 - O10
4035 =item E1 - - - - - -
4037 =item - - - - - - - - -
4039 =item - -
4041 =back
4043 B<bash-parallel> is not as much a command as it is a shell script that
4044 you have to alter. It requires you to change the shell function
4045 process_job that runs the job, and set $MAX_POOL_SIZE to the number of
4046 jobs to run in parallel.
4048 It is half as fast as GNU B<parallel> for short jobs.
4050 https://github.com/thilinaba/bash-parallel
4051 (Last checked: 2023-05)
4054 =head2 DIFFERENCES BETWEEN PaSH AND GNU Parallel
4056 Summary (see legend above): N/A
4058 B<pash> is quite different from GNU B<parallel>. It is not a general
4059 parallelizer. It takes a shell script and analyses it and parallelizes
4060 parts of it by replacing the parts with commands that will give the same
4061 result.
4063 This will replace B<sort> with a command that does pretty much the
4064 same as B<parsort --parallel=8> (except somewhat slower):
4066 pa.sh --width 8 -c 'cat bigfile | sort'
4068 However, even a simple change will confuse B<pash> and you will get no
4069 parallelization:
4071 pa.sh --width 8 -c 'mysort() { sort; }; cat bigfile | mysort'
4072 pa.sh --width 8 -c 'cat bigfile | sort | md5sum'
4074 From the source it seems B<pash> only looks at: awk cat col comm cut
4075 diff grep head mkfifo mv rm sed seq sort tail tee tr uniq wc xargs
4077 For pipelines where these commands are bottlenecks, it might be worth
4078 testing if B<pash> is faster than GNU B<parallel>.
B<pash> does not respect $TMPDIR but always uses /tmp. If B<pash> dies
unexpectedly it does not clean up.
4083 https://github.com/binpash/pash
4084 (Last checked: 2023-05)
4087 =head2 DIFFERENCES BETWEEN korovkin-parallel AND GNU Parallel
4089 Summary (see legend above):
4091 =over
4093 =item I1 - - - - - -
4095 =item M1 - - - - M6
4097 =item - - O3 - - - - x x -
4099 =item E1 - - - - - -
4101 =item R1 - - - - R6 x x -
4103 =item - -
4105 =back
4107 B<korovkin-parallel> prepends all lines with some info.
4109 The output is colored with 6 color combinations, so job 1 and 7 will
4110 get the same color.
4112 You can get similar output with:
4114 (echo ...) |
4115 parallel --color -j 10 --lb --tagstring \
4116 '[l:{#}:{=$_=sprintf("%7.03f",::now()-$^T)=} {=$_=hh_mm_ss($^T)=} {%}]'
4118 Lines longer than 8192 chars are broken into lines shorter than
4119 8192. B<korovkin-parallel> loses the last char for lines exactly 8193
4120 chars long.
4122 Short lines from different jobs do not mix, but long lines do:
  fun() {
    perl -e '$a="'$1'"x1000000; for(1..'$2') { print $a };';
    echo;
  }
  export -f fun
4129 (echo fun a 100;echo fun b 100) | korovkin-parallel | tr -s abcdef
4130 # Compare to:
4131 (echo fun a 100;echo fun b 100) | parallel | tr -s abcdef
4133 There should be only one line of a's and one line of b's.
Just like GNU B<parallel>, B<korovkin-parallel> offers a master/slave
4136 model, so workers on other servers can do some of the tasks. But
4137 contrary to GNU B<parallel> you must manually start workers on these
4138 servers. The communication is neither authenticated nor encrypted.
It caches output in RAM: a 1GB line uses ~2.5GB RAM.
4142 https://github.com/korovkin/parallel
4143 (Last checked: 2023-07)
4146 =head2 DIFFERENCES BETWEEN xe AND GNU Parallel
4148 Summary (see legend above):
4150 =over
4152 =item I1 I2 - I4 - - I7
4154 =item M1 - M3 M4 - M6
4156 =item - O2 O3 - O5 O6 - O8 - O10
4158 =item E1 - - E4 - - -
4160 =item - - - - - - - - -
4162 =item - -
4164 =back
4166 B<xe> has a peculiar limitation:
4168 echo /bin/echo | xe {} OK
4169 echo echo | xe /bin/{} fails
4172 =head3 EXAMPLES
4174 Compress all .c files in the current directory, using all CPU cores:
4176 1$ xe -a -j0 gzip -- *.c
4178 1$ parallel gzip ::: *.c
4180 Remove all empty files, using lr(1):
4182 2$ lr -U -t 'size == 0' | xe -N0 rm
4184 2$ lr -U -t 'size == 0' | parallel -X rm
4186 Convert .mp3 to .ogg, using all CPU cores:
4188 3$ xe -a -j0 -s 'ffmpeg -i "${1}" "${1%.mp3}.ogg"' -- *.mp3
4190 3$ parallel ffmpeg -i {} {.}.ogg ::: *.mp3
4192 Same, using percent rules:
4194 4$ xe -a -j0 -p %.mp3 ffmpeg -i %.mp3 %.ogg -- *.mp3
4196 4$ parallel --rpl '% s/\.mp3// or skip' ffmpeg -i %.mp3 %.ogg ::: *.mp3
4198 Similar, but hiding output of ffmpeg, instead showing spawned jobs:
4200 5$ xe -ap -j0 -vvq '%.{m4a,ogg,opus}' ffmpeg -y -i {} out/%.mp3 -- *
4202 5$ parallel -v --rpl '% s/\.(m4a|ogg|opus)// or skip' \
4203 ffmpeg -y -i {} out/%.mp3 '2>/dev/null' ::: *
4205 5$ parallel -v ffmpeg -y -i {} out/{.}.mp3 '2>/dev/null' ::: *
4207 https://github.com/leahneukirchen/xe
4208 (Last checked: 2023-08)
4211 =head2 DIFFERENCES BETWEEN sp AND GNU Parallel
4213 Summary (see legend above):
4215 =over
4217 =item - - - I4 - - -
4219 =item M1 - M3 - - M6
4221 =item - O2 O3 - O5 (O6) - x x O10
4223 =item E1 - - - - - -
4225 =item - - - - - - - - -
4227 =item - -
4229 =back
4231 B<sp> has very few options.
4233 It can either be used like:
4235 sp command {} option :: arg1 arg2 arg3
4237 which is similar to:
4239 parallel command {} option ::: arg1 arg2 arg3
or:

  sp command1 :: "command2 -option" :: "command3 foo bar"
4245 which is similar to:
4247 parallel ::: command1 "command2 -option" "command3 foo bar"
4249 B<sp> deals badly with too many commands: This causes B<sp> to run out
4250 of file handles and gives data loss.
4252 For each command that fails, B<sp> will print an error message on
4253 stderr (standard error).
You cannot use exported shell functions as commands.
4257 =head3 EXAMPLES
4259 1$ sp echo {} :: 1 2 3
4261 1$ parallel echo {} ::: 1 2 3
4263 2$ sp echo {} {} :: 1 2 3
  2$ parallel echo {} {} ::: 1 2 3
4267 3$ sp echo 1 :: echo 2 :: echo 3
4269 3$ parallel ::: 'echo 1' 'echo 2' 'echo 3'
4271 4$ sp a foo bar :: "b 'baz bar'" :: c
  4$ parallel ::: 'a foo bar' "b 'baz bar'" c
4275 https://github.com/SergioBenitez/sp
4276 (Last checked: 2023-10)
4279 =head2 Todo
4281 https://github.com/justanhduc/task-spooler
4283 https://manpages.ubuntu.com/manpages/xenial/man1/tsp.1.html
4285 https://www.npmjs.com/package/concurrently
4287 http://code.google.com/p/push/ (cannot compile)
4289 https://github.com/krashanoff/parallel
4291 https://github.com/Nukesor/pueue
4293 https://arxiv.org/pdf/2012.15443.pdf KumQuat
4295 https://github.com/JeiKeiLim/simple_distribute_job
4297 https://github.com/reggi/pkgrun - not obvious how to use
4299 https://github.com/benoror/better-npm-run - not obvious how to use
4301 https://github.com/bahmutov/with-package
4303 https://github.com/flesler/parallel
4305 https://github.com/Julian/Verge
4307 https://vicerveza.homeunix.net/~viric/soft/ts/
4309 https://github.com/chapmanjacobd/que
4313 =head1 TESTING OTHER TOOLS
There are certain issues that are very common in parallelizing
tools. Here are a few stress tests. Be warned: If the tool is badly
coded it may overload your machine.
4320 =head2 MIX: Output mixes
4322 Output from 2 jobs should not mix. If the output is not used, this
4323 does not matter; but if the output I<is> used then it is important
4324 that you do not get half a line from one job followed by half a line
4325 from another job.
4327 If the tool does not buffer, output will most likely mix now and then.
4329 This test stresses whether output mixes.
4331 #!/bin/bash
4333 paralleltool="parallel -j 30"
  cat <<-EOF > mycommand
  #!/bin/bash

  # If a, b, c, d, e, and f mix: Very bad
  perl -e 'print STDOUT "a"x3000_000," "'
  perl -e 'print STDERR "b"x3000_000," "'
  perl -e 'print STDOUT "c"x3000_000," "'
  perl -e 'print STDERR "d"x3000_000," "'
  perl -e 'print STDOUT "e"x3000_000," "'
  perl -e 'print STDERR "f"x3000_000," "'
  echo
  echo >&2
  EOF
  chmod +x mycommand
4350 # Run 30 jobs in parallel
4351 seq 30 |
4352 $paralleltool ./mycommand > >(tr -s abcdef) 2> >(tr -s abcdef >&2)
4354 # 'a c e' and 'b d f' should always stay together
4355 # and there should only be a single line per job
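
Checking the result by eye is tedious with 30 jobs. The sketch below
counts the stdout lines that are not exactly 'a c e ' after
B<tr -s abcdef> has squeezed the repeated letters. The function name
B<count_mixed_lines> is made up for this example; a result of 0 means
no mixing was seen on stdout.

```shell
#!/bin/bash

# Hypothetical checker: read the squeezed stdout of the MIX test and
# count lines that are not exactly "a c e " (one unmixed line per job).
count_mixed_lines() {
  grep -vxc 'a c e ' || true   # grep exits 1 when the count is 0
}

printf 'a c e \na c e \n'   | count_mixed_lines   # no mixed lines
printf 'a c e \na c a e \n' | count_mixed_lines   # one mixed line
```

With the test above it could be used as:
B<seq 30 | $paralleltool ./mycommand 2E<gt>/dev/null | tr -s abcdef | count_mixed_lines>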
4358 =head2 STDERRMERGE: Stderr is merged with stdout
4360 Output from stdout and stderr should not be merged, but kept separated.
4362 This test shows whether stdout is mixed with stderr.
4364 #!/bin/bash
4366 paralleltool="parallel -j0"
  cat <<-EOF > mycommand
  #!/bin/bash

  echo stdout
  echo stderr >&2
  echo stdout
  echo stderr >&2
  EOF
  chmod +x mycommand
4378 # Run one job
4379 echo |
4380 $paralleltool ./mycommand > stdout 2> stderr
4381 cat stdout
4382 cat stderr
4385 =head2 RAM: Output limited by RAM
4387 Some tools cache output in RAM. This makes them extremely slow if the
4388 output is bigger than physical memory and crash if the output is
4389 bigger than the virtual memory.
4391 #!/bin/bash
4393 paralleltool="parallel -j0"
  cat <<'EOF' > mycommand
  #!/bin/bash

  # Generate 1 GB output
  yes "`perl -e 'print \"c\"x30_000'`" | head -c 1G
  EOF
  chmod +x mycommand
4403 # Run 20 jobs in parallel
4404 # Adjust 20 to be > physical RAM and < free space on /tmp
4405 seq 20 | time $paralleltool ./mycommand | wc -c
4408 =head2 DISKFULL: Incomplete data if /tmp runs full
4410 If caching is done on disk, the disk can run full during the run. Not
4411 all programs discover this. GNU Parallel discovers it, if it stays
4412 full for at least 2 seconds.
4414 #!/bin/bash
4416 paralleltool="parallel -j0"
4418 # This should be a dir with less than 100 GB free space
4419 smalldisk=/tmp/shm/parallel
4421 TMPDIR="$smalldisk"
4422 export TMPDIR
  max_output() {
    # Force worst case scenario:
    # Make GNU Parallel only check once per second
    sleep 10
    # Generate 100 GB to fill $TMPDIR
    # Adjust if /tmp is bigger than 100 GB
    yes | head -c 100G >$TMPDIR/$$
    # Generate 10 MB output that will not be buffered
    # due to full disk
    perl -e 'print "X"x10_000_000' | head -c 10M
    echo This part is missing from incomplete output
    sleep 2
    rm $TMPDIR/$$
    echo Final output
  }

  export -f max_output
4441 seq 10 | $paralleltool max_output | tr -s X
4444 =head2 CLEANUP: Leaving tmp files at unexpected death
Some tools do not clean up tmp files if they are killed. This applies
in particular to tools that buffer on disk.
4449 #!/bin/bash
4451 paralleltool=parallel
4453 ls /tmp >/tmp/before
4454 seq 10 | $paralleltool sleep &
4455 pid=$!
4456 # Give the tool time to start up
4457 sleep 1
4458 # Kill it without giving it a chance to cleanup
  kill -9 $pid
4460 # Should be empty: No files should be left behind
4461 diff <(ls /tmp) /tmp/before
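
For comparison, this is roughly what cleaning up looks like in a shell
tool. The sketch is an illustration only (the function name
B<worker_with_cleanup> is made up): an EXIT trap removes the tmp file,
and traps on TERM and INT turn those signals into a normal exit.
SIGKILL (B<kill -9>) cannot be trapped, so a tool that wants to pass
the test above has to avoid leaving files behind in some other way,
for example by unlinking the file right after opening it.

```shell
#!/bin/bash

# Hypothetical worker: creates a tmp file and removes it on exit.
# The body runs in a subshell so the traps do not leak to the caller.
worker_with_cleanup() (
  tmp=$(mktemp) || exit 1
  trap 'rm -f "$tmp"' EXIT     # runs on every normal exit
  trap 'exit 143' TERM         # turn SIGTERM into a normal exit
  trap 'exit 130' INT          # turn SIGINT into a normal exit
  sleep "${1:-0}"
  echo "$tmp"
)

file=$(worker_with_cleanup 0)
[ -e "$file" ] || echo "tmp file was removed"
```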
4464 =head2 SPCCHAR: Dealing badly with special file names.
4466 It is not uncommon for users to create files like:
4468 My brother's 12" *** record (costs $$$).jpg
4470 Some tools break on this.
4472 #!/bin/bash
4474 paralleltool=parallel
4476 touch "My brother's 12\" *** record (costs \$\$\$).jpg"
4477 ls My*jpg | $paralleltool ls -l
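
A tool written in bash can handle such names if it never lets the
shell re-interpret them. The sketch below reads NUL-terminated names
and only uses them inside double quotes. It assumes the names arrive
NUL-separated (e.g. from B<find -print0>); the function name
B<bracket_names> is made up for this example.

```shell
#!/bin/bash

# Hypothetical helper: print every NUL-terminated name from stdin in
# brackets, without letting the shell re-interpret the name.
bracket_names() {
  local name
  while IFS= read -r -d '' name; do
    printf '[%s]\n' "$name"
  done
}

printf '%s\0' "My brother's 12\" *** record (costs \$\$\$).jpg" |
  bracket_names
```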
4480 =head2 COMPOSED: Composed commands do not work
4482 Some tools require you to wrap composed commands into B<bash -c>.
4484 echo bar | $paralleltool echo foo';' echo {}
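
With a tool that can only run a single program, the composed command
has to be wrapped. The sketch below uses B<xargs> as an example of such
a tool; the function name B<composed_via_xargs> is made up. The
argument is passed as $1 instead of being pasted into the script text,
which avoids quoting problems.

```shell
#!/bin/bash

# xargs can only exec one program, so the composed command
# 'echo foo; echo ARG' is wrapped in 'bash -c'. The '_' fills $0.
composed_via_xargs() {
  xargs -I{} bash -c 'echo foo; echo "$1"' _ {}
}

echo bar | composed_via_xargs
```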
4487 =head2 ONEREP: Only one replacement string allowed
4489 Some tools can only insert the argument once.
4491 echo bar | $paralleltool echo {} foo {}
4494 =head2 INPUTSIZE: Length of input should not be limited
4496 Some tools limit the length of the input lines artificially with no good
4497 reason. GNU B<parallel> does not:
4499 perl -e 'print "foo."."x"x100_000_000' | parallel echo {.}
GNU B<parallel> limits the command to run to 128 KB due to execve(2):
4503 perl -e 'print "x"x131_000' | parallel echo {} | wc
4506 =head2 NUMWORDS: Speed depends on number of words
4508 Some tools become very slow if output lines have many words.
4510 #!/bin/bash
4512 paralleltool=parallel
  cat <<-EOF > mycommand
  #!/bin/bash

  # 10 MB of lines with 1000 words
  yes "`seq 1000`" | head -c 10M
  EOF
  chmod +x mycommand
4522 # Run 30 jobs in parallel
4523 seq 30 | time $paralleltool -j0 ./mycommand > /dev/null
4525 =head2 4GB: Output with a line > 4GB should be OK
4527 #!/bin/bash
4529 paralleltool="parallel -j0"
  cat <<-EOF > mycommand
  #!/bin/bash

  perl -e '\$a="a"x1000_000; for(1..5000) { print \$a }'
  EOF
  chmod +x mycommand
4538 # Run 1 job
4539 seq 1 | $paralleltool ./mycommand | LC_ALL=C wc
4542 =head1 AUTHOR
4544 When using GNU B<parallel> for a publication please cite:
4546 O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
4547 The USENIX Magazine, February 2011:42-47.
4549 This helps funding further development; and it won't cost you a cent.
4550 If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
4552 Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk
4554 Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk
4556 Copyright (C) 2010-2023 Ole Tange, http://ole.tange.dk and Free
4557 Software Foundation, Inc.
Parts of the manual concerning B<xargs> compatibility are inspired by
the manual of B<xargs> from GNU findutils 4.4.2.
4563 =head1 LICENSE
4565 This program is free software; you can redistribute it and/or modify
4566 it under the terms of the GNU General Public License as published by
4567 the Free Software Foundation; either version 3 of the License, or
4568 at your option any later version.
4570 This program is distributed in the hope that it will be useful,
4571 but WITHOUT ANY WARRANTY; without even the implied warranty of
4572 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
4573 GNU General Public License for more details.
4575 You should have received a copy of the GNU General Public License
4576 along with this program. If not, see <https://www.gnu.org/licenses/>.
4578 =head2 Documentation license I
4580 Permission is granted to copy, distribute and/or modify this
4581 documentation under the terms of the GNU Free Documentation License,
4582 Version 1.3 or any later version published by the Free Software
4583 Foundation; with no Invariant Sections, with no Front-Cover Texts, and
4584 with no Back-Cover Texts. A copy of the license is included in the
4585 file LICENSES/GFDL-1.3-or-later.txt.
4587 =head2 Documentation license II
4589 You are free:
4591 =over 9
4593 =item B<to Share>
4595 to copy, distribute and transmit the work
4597 =item B<to Remix>
4599 to adapt the work
4601 =back
4603 Under the following conditions:
4605 =over 9
4607 =item B<Attribution>
4609 You must attribute the work in the manner specified by the author or
4610 licensor (but not in any way that suggests that they endorse you or
4611 your use of the work).
4613 =item B<Share Alike>
4615 If you alter, transform, or build upon this work, you may distribute
4616 the resulting work only under the same, similar or a compatible
4617 license.
4619 =back
4621 With the understanding that:
4623 =over 9
4625 =item B<Waiver>
4627 Any of the above conditions can be waived if you get permission from
4628 the copyright holder.
4630 =item B<Public Domain>
4632 Where the work or any of its elements is in the public domain under
4633 applicable law, that status is in no way affected by the license.
4635 =item B<Other Rights>
4637 In no way are any of the following rights affected by the license:
4639 =over 2
4641 =item *
4643 Your fair dealing or fair use rights, or other applicable
4644 copyright exceptions and limitations;
4646 =item *
4648 The author's moral rights;
4650 =item *
4652 Rights other persons may have either in the work itself or in
4653 how the work is used, such as publicity or privacy rights.
4655 =back
4657 =back
4659 =over 9
4661 =item B<Notice>
4663 For any reuse or distribution, you must make clear to others the
4664 license terms of this work.
4666 =back
A copy of the full license is included in the file
LICENSES/CC-BY-SA-4.0.txt.
4672 =head1 DEPENDENCIES
4674 GNU B<parallel> uses Perl, and the Perl modules Getopt::Long,
4675 IPC::Open3, Symbol, IO::File, POSIX, and File::Temp. For remote usage
4676 it also uses rsync with ssh.
4679 =head1 SEE ALSO
4681 B<find>(1), B<xargs>(1), B<make>(1), B<pexec>(1), B<ppss>(1),
4682 B<xjobs>(1), B<prll>(1), B<dxargs>(1), B<mdm>(1)
4684 =cut