Released as 20240522 ('Tbilisi')
#!/usr/bin/perl -w

# SPDX-FileCopyrightText: 2021-2024 Ole Tange, http://ole.tange.dk and Free Software Foundation, Inc.
# SPDX-License-Identifier: GFDL-1.3-or-later
# SPDX-License-Identifier: CC-BY-SA-4.0

=encoding utf8

=head1 NAME

parallel_alternatives - Alternatives to GNU B<parallel>

=head1 DIFFERENCES BETWEEN GNU Parallel AND ALTERNATIVES

There are a lot of programs that share functionality with GNU
B<parallel>. Some of these are specialized tools, and while GNU
B<parallel> can emulate many of them, a specialized tool can be better
at a given task. GNU B<parallel> strives to include the best of the
general functionality without sacrificing ease of use.

B<parallel> has existed since 2002-01-06 and as GNU B<parallel> since
2010. Many of the alternatives have not had the vitality to survive
that long, but have come and gone during that time.

GNU B<parallel> has been actively maintained with a new release every
month since 2010. Most alternatives are fleeting interests of their
developers, with irregular releases and only a few years of
maintenance.

32 =head2 SUMMARY LEGEND
34 The following features are in some of the comparable tools:
36 =head3 Inputs
38 =over
40 =item I1. Arguments can be read from stdin
42 =item I2. Arguments can be read from a file
44 =item I3. Arguments can be read from multiple files
46 =item I4. Arguments can be read from command line
48 =item I5. Arguments can be read from a table
50 =item I6. Arguments can be read from the same file using #! (shebang)
52 =item I7. Line oriented input as default (Quoting of special chars not needed)
54 =back
57 =head3 Manipulation of input
59 =over
61 =item M1. Composed command
63 =item M2. Multiple arguments can fill up an execution line
65 =item M3. Arguments can be put anywhere in the execution line
67 =item M4. Multiple arguments can be put anywhere in the execution line
69 =item M5. Arguments can be replaced with context
71 =item M6. Input can be treated as the complete command line
73 =back
76 =head3 Outputs
78 =over
=item O1. Grouping output so output from different jobs does not mix
82 =item O2. Send stderr (standard error) to stderr (standard error)
84 =item O3. Send stdout (standard output) to stdout (standard output)
86 =item O4. Order of output can be same as order of input
88 =item O5. Stdout only contains stdout (standard output) from the command
90 =item O6. Stderr only contains stderr (standard error) from the command
92 =item O7. Buffering on disk
94 =item O8. No temporary files left if killed
96 =item O9. Test if disk runs full during run
98 =item O10. Output of a line bigger than 4 GB
100 =back
103 =head3 Execution
105 =over
107 =item E1. Run jobs in parallel
109 =item E2. List running jobs
111 =item E3. Finish running jobs, but do not start new jobs
113 =item E4. Number of running jobs can depend on number of cpus
115 =item E5. Finish running jobs, but do not start new jobs after first failure
117 =item E6. Number of running jobs can be adjusted while running
119 =item E7. Only spawn new jobs if load is less than a limit
121 =back
124 =head3 Remote execution
126 =over
128 =item R1. Jobs can be run on remote computers
130 =item R2. Basefiles can be transferred
132 =item R3. Argument files can be transferred
134 =item R4. Result files can be transferred
136 =item R5. Cleanup of transferred files
138 =item R6. No config files needed
140 =item R7. Do not run more than SSHD's MaxStartups can handle
142 =item R8. Configurable SSH command
144 =item R9. Retry if connection breaks occasionally
146 =back
149 =head3 Semaphore
151 =over
153 =item S1. Possibility to work as a mutex
155 =item S2. Possibility to work as a counting semaphore
157 =back
160 =head3 Legend
162 =over
164 =item - = no
166 =item x = not applicable
168 =item ID = yes
170 =back
As not every new version of the programs is tested, the table may be
outdated. Please file a bug report if you find errors (see REPORTING
BUGS).
176 parallel:
178 =over
180 =item I1 I2 I3 I4 I5 I6 I7
182 =item M1 M2 M3 M4 M5 M6
184 =item O1 O2 O3 O4 O5 O6 O7 O8 O9 O10
186 =item E1 E2 E3 E4 E5 E6 E7
188 =item R1 R2 R3 R4 R5 R6 R7 R8 R9
190 =item S1 S2
192 =back
195 =head2 DIFFERENCES BETWEEN xargs AND GNU Parallel
197 Summary (see legend above):
199 =over
201 =item I1 I2 - - - - -
203 =item - M2 M3 - - -
205 =item - O2 O3 - O5 O6
207 =item E1 - - - - - -
209 =item - - - - - x - - -
211 =item - -
213 =back
215 B<xargs> offers some of the same possibilities as GNU B<parallel>.
217 B<xargs> deals badly with special characters (such as space, \, ' and
218 "). To see the problem try this:
220 touch important_file
221 touch 'not important_file'
222 ls not* | xargs rm
223 mkdir -p "My brother's 12\" records"
224 ls | xargs rmdir
225 touch 'c:\windows\system32\clfs.sys'
226 echo 'c:\windows\system32\clfs.sys' | xargs ls -l
You can specify B<-0>, but many input generators use B<newline> as the
separator by default and need extra options or workarounds to use
B<NUL> instead. E.g. B<awk>, B<ls>, B<echo>, B<tar -v>, B<head>
(requires using B<-z>), B<tail> (requires using B<-z>), B<sed>
(requires using B<-z>), B<perl> (B<-0> and \0 instead of \n),
B<locate> (requires using B<-0>), B<find> (requires using
B<-print0>), B<grep> (requires using B<-z> or B<-Z>), B<sort>
(requires using B<-z>).
236 GNU B<parallel>'s newline separation can be emulated with:
238 cat | xargs -d "\n" -n1 command
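
As a hedged illustration of the B<-0> route: when the generator can
emit B<NUL>-separated names (here B<find -print0>), B<xargs -0> passes
each name through intact. The temporary directory and the file names
are assumptions made for this sketch.

```shell
# Sketch only: the directory and file names are illustrative assumptions.
tmp=$(mktemp -d)
touch "$tmp/important_file" "$tmp/not important_file" \
      "$tmp/My brother's 12\" records"
# One printf invocation per file name; spaces and quotes survive
# because NUL, not whitespace, separates the arguments.
nul_count=$(find "$tmp" -type f -print0 | xargs -0 -n1 printf '%s\n' | wc -l)
nul_count=$((nul_count))   # normalize any padding printed by wc
echo "$nul_count files seen"
rm -rf "$tmp"
```

Each of the three names reaches B<printf> as a single argument, so the
count matches the number of files created.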
B<xargs> can run a given number of jobs in parallel, but has no
built-in support for running one job per CPU core.
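
One possible workaround, not a built-in B<xargs> feature, is to look
up the core count yourself and pass it to B<-P>. The sketch below
assumes B<getconf _NPROCESSORS_ONLN> is available (it is on GNU/Linux
and most BSDs) and falls back to a single job otherwise.

```shell
# Assumption: getconf knows _NPROCESSORS_ONLN; fall back to a single job.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
# Run one echo per argument, at most $cores at a time.
# Sorting makes the result independent of completion order.
sorted_out=$(printf '%s\n' a b c d | xargs -n1 -P "$cores" echo | sort)
echo "$sorted_out"
```

GNU B<parallel> needs no such lookup: by default it runs one job per
CPU thread unless B<-j> says otherwise.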
243 B<xargs> has no support for grouping the output, therefore output may
244 run together, e.g. the first half of a line is from one process and
245 the last half of the line is from another process. The example
246 B<Parallel grep> cannot be done reliably with B<xargs> because of
247 this. To see this in action try:
249 parallel perl -e "'"'$a="1"."{}"x10000000;print $a,"\n"'"'" \
250 '>' {} ::: a b c d e f g h
251 # Serial = no mixing = the wanted result
252 # 'tr -s a-z' squeezes repeating letters into a single letter
253 echo a b c d e f g h | xargs -P1 -n1 grep 1 | tr -s a-z
254 # Compare to 8 jobs in parallel
255 parallel -kP8 -n1 grep 1 ::: a b c d e f g h | tr -s a-z
256 echo a b c d e f g h | xargs -P8 -n1 grep 1 | tr -s a-z
257 echo a b c d e f g h | xargs -P8 -n1 grep --line-buffered 1 | \
258 tr -s a-z
260 Or try this:
  slow_seq() {
    echo Count to "$@"
    seq "$@" |
      perl -ne '$|=1; for(split//){ print; select($a,$a,$a,0.100);}'
  }
  export -f slow_seq
  # Serial = no mixing = the wanted result
  seq 8 | xargs -n1 -P1 -I {} bash -c 'slow_seq {}'
  # Compare to 8 jobs in parallel
  seq 8 | parallel -P8 slow_seq {}
  seq 8 | xargs -n1 -P8 -I {} bash -c 'slow_seq {}'
274 B<xargs> has no support for keeping the order of the output, therefore
275 if running jobs in parallel using B<xargs> the output of the second
276 job cannot be postponed till the first job is done.
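
A minimal sketch of the ordering problem, using only B<xargs> and
B<sh>. The sleep times are arbitrary values chosen so that the second
job finishes before the first.

```shell
# Two jobs: the first sleeps 1 second, the second finishes at once.
# xargs prints results in completion order, not input order.
completion_order=$(printf '%s\n' 1 0 |
  xargs -n1 -P2 sh -c 'sleep "$1"; echo "$1"' _)
echo "$completion_order"
```

The input order is 1, 0 but the output order is 0, 1. With GNU
B<parallel> the option B<-k> postpones the output of later jobs until
the earlier ones are done, so the output follows the input order.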
278 B<xargs> has no support for running jobs on remote computers.
B<xargs> has no support for context replace, so you will have to
construct the arguments yourself.

If you use a replace string in B<xargs> (B<-I>) you cannot force
B<xargs> to use more than one argument.
286 Quoting in B<xargs> works like B<-q> in GNU B<parallel>. This means
287 composed commands and redirection require using B<bash -c>.
289 ls | parallel "wc {} >{}.wc"
290 ls | parallel "echo {}; ls {}|wc"
becomes (assuming you have 8 cores and that none of the filenames
contain space, " or '):
295 ls | xargs -d "\n" -P8 -I {} bash -c "wc {} >{}.wc"
296 ls | xargs -d "\n" -P8 -I {} bash -c "echo {}; ls {}|wc"
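
The restriction on space and quotes comes from splicing the file name
into the B<bash -c> string. A sketch of one way around it, assuming
B<xargs -0> is available, is to hand the name to the inner shell as a
positional parameter instead. The file name below is invented for the
example.

```shell
# Sketch only: the directory and file name are illustrative assumptions.
tmp=$(mktemp -d)
printf 'one\ntwo\n' > "$tmp/a b.txt"
# Pass the file name as $1 instead of splicing it into the command
# string, so spaces and quotes in the name cannot change the command.
find "$tmp" -name '*.txt' -print0 |
  xargs -0 -n1 -P2 sh -c 'wc -l < "$1" > "$1.wc"' _
lines=$(( $(cat "$tmp/a b.txt.wc") ))
echo "$lines"
rm -rf "$tmp"
```

The C<_> fills C<$0> of the inner shell, so the file name appended by
B<xargs> becomes C<$1>.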
298 A more extreme example can be found on:
299 https://unix.stackexchange.com/q/405552/
301 https://www.gnu.org/software/findutils/
304 =head2 DIFFERENCES BETWEEN find -exec AND GNU Parallel
306 Summary (see legend above):
308 =over
310 =item - - - x - x -
312 =item - M2 M3 - - - -
314 =item - O2 O3 O4 O5 O6
316 =item - - - - - - -
318 =item - - - - - - - - -
320 =item x x
322 =back
324 B<find -exec> offers some of the same possibilities as GNU B<parallel>.
326 B<find -exec> only works on files. Processing other input (such as
327 hosts or URLs) will require creating these inputs as files. B<find
328 -exec> has no support for running commands in parallel.
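
For reference, a small sketch of the two forms of B<-exec>: C<\;> runs
the command once per file, while C<+> fills up one command line with
several files (M2). Both run serially. The directory and file names
are assumptions made for the example.

```shell
# Sketch only: the directory and file names are illustrative assumptions.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"
# '\;' starts one command per file: three invocations, one argument each.
per_file=$(find "$tmp" -type f -exec sh -c 'echo "$#"' _ {} \; | wc -l)
# '+' batches the files: a single invocation that receives all three.
batched=$(find "$tmp" -type f -exec sh -c 'echo "$#"' _ {} +)
per_file=$((per_file))   # normalize any padding printed by wc
echo "$per_file invocations, then $batched arguments in one invocation"
rm -rf "$tmp"
```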
330 https://www.gnu.org/software/findutils/
331 (Last checked: 2019-01)
334 =head2 DIFFERENCES BETWEEN make -j AND GNU Parallel
336 Summary (see legend above):
338 =over
340 =item - - - - - - -
342 =item - - - - - -
344 =item O1 O2 O3 - x O6
346 =item E1 - - - E5 -
348 =item - - - - - - - - -
350 =item - -
352 =back
354 B<make -j> can run jobs in parallel, but requires a crafted Makefile
355 to do this. That results in extra quoting to get filenames containing
356 newlines to work correctly.
B<make -j> computes a dependency graph before running jobs. Jobs run
by GNU B<parallel> do not depend on each other.
361 (Very early versions of GNU B<parallel> were coincidentally implemented
362 using B<make -j>).
364 https://www.gnu.org/software/make/
365 (Last checked: 2019-01)
368 =head2 DIFFERENCES BETWEEN ppss AND GNU Parallel
370 Summary (see legend above):
372 =over
374 =item I1 I2 - - - - I7
376 =item M1 - M3 - - M6
378 =item O1 - - x - -
380 =item E1 E2 ?E3 E4 - - -
382 =item R1 R2 R3 R4 - - ?R7 ? ?
384 =item - -
386 =back
388 B<ppss> is also a tool for running jobs in parallel.
The output of B<ppss> is status information and thus not useful for
using as input for another command. The output from the jobs is put
into files.
394 The argument replace string ($ITEM) cannot be changed. Arguments must
395 be quoted - thus arguments containing special characters (space '"&!*)
396 may cause problems. More than one argument is not supported. Filenames
397 containing newlines are not processed correctly. When reading input
398 from a file null cannot be used as a terminator. B<ppss> needs to read
399 the whole input file before starting any jobs.
401 Output and status information is stored in ppss_dir and thus requires
402 cleanup when completed. If the dir is not removed before running
403 B<ppss> again it may cause nothing to happen as B<ppss> thinks the
404 task is already done. GNU B<parallel> will normally not need cleaning
405 up if running locally and will only need cleaning up if stopped
406 abnormally and running remote (B<--cleanup> may not complete if
407 stopped abnormally). The example B<Parallel grep> would require extra
408 postprocessing if written using B<ppss>.
410 For remote systems PPSS requires 3 steps: config, deploy, and
411 start. GNU B<parallel> only requires one step.
413 =head3 EXAMPLES FROM ppss MANUAL
415 Here are the examples from B<ppss>'s manual page with the equivalent
416 using GNU B<parallel>:
418 1$ ./ppss.sh standalone -d /path/to/files -c 'gzip '
420 1$ find /path/to/files -type f | parallel gzip
422 2$ ./ppss.sh standalone -d /path/to/files \
423 -c 'cp "$ITEM" /destination/dir '
425 2$ find /path/to/files -type f | parallel cp {} /destination/dir
427 3$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q '
429 3$ parallel -a list-of-urls.txt wget -q
431 4$ ./ppss.sh standalone -f list-of-urls.txt -c 'wget -q "$ITEM"'
433 4$ parallel -a list-of-urls.txt wget -q {}
435 5$ ./ppss config -C config.cfg -c 'encode.sh ' -d /source/dir \
436 -m 192.168.1.100 -u ppss -k ppss-key.key -S ./encode.sh \
437 -n nodes.txt -o /some/output/dir --upload --download;
438 ./ppss deploy -C config.cfg
439 ./ppss start -C config
441 5$ # parallel does not use configs. If you want
442 # a different username put it in nodes.txt: user@hostname
443 find source/dir -type f |
444 parallel --sshloginfile nodes.txt --trc {.}.mp3 \
445 lame -a {} -o {.}.mp3 --preset standard --quiet
447 6$ ./ppss stop -C config.cfg
449 6$ killall -TERM parallel
451 7$ ./ppss pause -C config.cfg
453 7$ Press: CTRL-Z or killall -SIGTSTP parallel
455 8$ ./ppss continue -C config.cfg
457 8$ Enter: fg or killall -SIGCONT parallel
459 9$ ./ppss.sh status -C config.cfg
461 9$ killall -SIGUSR2 parallel
463 https://github.com/louwrentius/PPSS
464 (Last checked: 2010-12)
467 =head2 DIFFERENCES BETWEEN pexec AND GNU Parallel
469 Summary (see legend above):
471 =over
473 =item I1 I2 - I4 I5 - -
475 =item M1 - M3 - - M6
477 =item O1 O2 O3 - O5 O6
479 =item E1 - - E4 - E6 -
481 =item R1 - - - - R6 - - -
483 =item S1 -
485 =back
487 B<pexec> is also a tool for running jobs in parallel.
489 =head3 EXAMPLES FROM pexec MANUAL
491 Here are the examples from B<pexec>'s info page with the equivalent
492 using GNU B<parallel>:
494 1$ pexec -o sqrt-%s.dat -p "$(seq 10)" -e NUM -n 4 -c -- \
495 'echo "scale=10000;sqrt($NUM)" | bc'
497 1$ seq 10 | parallel -j4 'echo "scale=10000;sqrt({})" | \
498 bc > sqrt-{}.dat'
500 2$ pexec -p "$(ls myfiles*.ext)" -i %s -o %s.sort -- sort
502 2$ ls myfiles*.ext | parallel sort {} ">{}.sort"
504 3$ pexec -f image.list -n auto -e B -u star.log -c -- \
505 'fistar $B.fits -f 100 -F id,x,y,flux -o $B.star'
507 3$ parallel -a image.list \
508 'fistar {}.fits -f 100 -F id,x,y,flux -o {}.star' 2>star.log
510 4$ pexec -r *.png -e IMG -c -o - -- \
511 'convert $IMG ${IMG%.png}.jpeg ; "echo $IMG: done"'
513 4$ ls *.png | parallel 'convert {} {.}.jpeg; echo {}: done'
515 5$ pexec -r *.png -i %s -o %s.jpg -c 'pngtopnm | pnmtojpeg'
517 5$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {}.jpg'
519 6$ for p in *.png ; do echo ${p%.png} ; done | \
520 pexec -f - -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'
522 6$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'
524 7$ LIST=$(for p in *.png ; do echo ${p%.png} ; done)
525 pexec -r $LIST -i %s.png -o %s.jpg -c 'pngtopnm | pnmtojpeg'
527 7$ ls *.png | parallel 'pngtopnm < {} | pnmtojpeg > {.}.jpg'
529 8$ pexec -n 8 -r *.jpg -y unix -e IMG -c \
530 'pexec -j -m blockread -d $IMG | \
531 jpegtopnm | pnmscale 0.5 | pnmtojpeg | \
532 pexec -j -m blockwrite -s th_$IMG'
  8$ # Combining GNU parallel and GNU sem.
535 ls *jpg | parallel -j8 'sem --id blockread cat {} | jpegtopnm |' \
536 'pnmscale 0.5 | pnmtojpeg | sem --id blockwrite cat > th_{}'
538 # If reading and writing is done to the same disk, this may be
539 # faster as only one process will be either reading or writing:
540 ls *jpg | parallel -j8 'sem --id diskio cat {} | jpegtopnm |' \
541 'pnmscale 0.5 | pnmtojpeg | sem --id diskio cat > th_{}'
543 https://www.gnu.org/software/pexec/
544 (Last checked: 2010-12)
547 =head2 DIFFERENCES BETWEEN xjobs AND GNU Parallel
549 B<xjobs> is also a tool for running jobs in parallel. It only supports
550 running jobs on your local computer.
552 B<xjobs> deals badly with special characters just like B<xargs>. See
553 the section B<DIFFERENCES BETWEEN xargs AND GNU Parallel>.
555 =head3 EXAMPLES FROM xjobs MANUAL
557 Here are the examples from B<xjobs>'s man page with the equivalent
558 using GNU B<parallel>:
560 1$ ls -1 *.zip | xjobs unzip
562 1$ ls *.zip | parallel unzip
564 2$ ls -1 *.zip | xjobs -n unzip
566 2$ ls *.zip | parallel unzip >/dev/null
568 3$ find . -name '*.bak' | xjobs gzip
570 3$ find . -name '*.bak' | parallel gzip
572 4$ ls -1 *.jar | sed 's/\(.*\)/\1 > \1.idx/' | xjobs jar tf
574 4$ ls *.jar | parallel jar tf {} '>' {}.idx
576 5$ xjobs -s script
578 5$ cat script | parallel
580 6$ mkfifo /var/run/my_named_pipe;
581 xjobs -s /var/run/my_named_pipe &
582 echo unzip 1.zip >> /var/run/my_named_pipe;
583 echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe
585 6$ mkfifo /var/run/my_named_pipe;
586 cat /var/run/my_named_pipe | parallel &
587 echo unzip 1.zip >> /var/run/my_named_pipe;
588 echo tar cf /backup/myhome.tar /home/me >> /var/run/my_named_pipe
590 https://www.maier-komor.de/xjobs.html
591 (Last checked: 2019-01)
594 =head2 DIFFERENCES BETWEEN prll AND GNU Parallel
596 B<prll> is also a tool for running jobs in parallel. It does not
597 support running jobs on remote computers.
599 B<prll> encourages using BASH aliases and BASH functions instead of
600 scripts. GNU B<parallel> supports scripts directly, functions if they
601 are exported using B<export -f>, and aliases if using B<env_parallel>.
603 B<prll> generates a lot of status information on stderr (standard
604 error) which makes it harder to use the stderr (standard error) output
605 of the job directly as input for another program.
607 =head3 EXAMPLES FROM prll's MANUAL
609 Here is the example from B<prll>'s man page with the equivalent
610 using GNU B<parallel>:
612 1$ prll -s 'mogrify -flip $1' *.jpg
614 1$ parallel mogrify -flip ::: *.jpg
616 https://github.com/exzombie/prll
617 (Last checked: 2019-01)
620 =head2 DIFFERENCES BETWEEN dxargs AND GNU Parallel
622 B<dxargs> is also a tool for running jobs in parallel.
624 B<dxargs> does not deal well with more simultaneous jobs than SSHD's
625 MaxStartups. B<dxargs> is only built for remote run jobs, but does not
626 support transferring of files.
https://web.archive.org/web/20120518070250/http://www.semicomplete.com/blog/geekery/distributed-xargs.html
630 (Last checked: 2019-01)
633 =head2 DIFFERENCES BETWEEN mdm/middleman AND GNU Parallel
B<middleman> (B<mdm>) is also a tool for running jobs in parallel.
637 =head3 EXAMPLES FROM middleman's WEBSITE
Here are the shell scripts of
https://web.archive.org/web/20110728064735/http://mdm.berlios.de/usage.html
ported to GNU B<parallel>:
643 1$ seq 19 | parallel buffon -o - | sort -n > result
644 cat files | parallel cmd
645 find dir -execdir sem cmd {} \;
647 https://github.com/cklin/mdm
648 (Last checked: 2019-01)
651 =head2 DIFFERENCES BETWEEN xapply AND GNU Parallel
653 B<xapply> can run jobs in parallel on the local computer.
655 =head3 EXAMPLES FROM xapply's MANUAL
657 Here are the examples from B<xapply>'s man page with the equivalent
658 using GNU B<parallel>:
660 1$ xapply '(cd %1 && make all)' */
662 1$ parallel 'cd {} && make all' ::: */
664 2$ xapply -f 'diff %1 ../version5/%1' manifest | more
666 2$ parallel diff {} ../version5/{} < manifest | more
668 3$ xapply -p/dev/null -f 'diff %1 %2' manifest1 checklist1
670 3$ parallel --link diff {1} {2} :::: manifest1 checklist1
672 4$ xapply 'indent' *.c
674 4$ parallel indent ::: *.c
676 5$ find ~ksb/bin -type f ! -perm -111 -print | \
677 xapply -f -v 'chmod a+x' -
679 5$ find ~ksb/bin -type f ! -perm -111 -print | \
680 parallel -v chmod a+x
682 6$ find */ -... | fmt 960 1024 | xapply -f -i /dev/tty 'vi' -
684 6$ sh <(find */ -... | parallel -s 1024 echo vi)
686 6$ find */ -... | parallel -s 1024 -Xuj1 vi
688 7$ find ... | xapply -f -5 -i /dev/tty 'vi' - - - - -
690 7$ sh <(find ... | parallel -n5 echo vi)
692 7$ find ... | parallel -n5 -uj1 vi
694 8$ xapply -fn "" /etc/passwd
696 8$ parallel -k echo < /etc/passwd
698 9$ tr ':' '\012' < /etc/passwd | \
699 xapply -7 -nf 'chown %1 %6' - - - - - - -
701 9$ tr ':' '\012' < /etc/passwd | parallel -N7 chown {1} {6}
703 10$ xapply '[ -d %1/RCS ] || echo %1' */
705 10$ parallel '[ -d {}/RCS ] || echo {}' ::: */
707 11$ xapply -f '[ -f %1 ] && echo %1' List | ...
709 11$ parallel '[ -f {} ] && echo {}' < List | ...
711 https://www.databits.net/~ksb/msrc/local/bin/xapply/xapply.html (Last
712 checked: 2010-12)
715 =head2 DIFFERENCES BETWEEN AIX apply AND GNU Parallel
717 B<apply> can build command lines based on a template and arguments -
718 very much like GNU B<parallel>. B<apply> does not run jobs in
719 parallel. B<apply> does not use an argument separator (like B<:::>);
720 instead the template must be the first argument.
722 =head3 EXAMPLES FROM IBM's KNOWLEDGE CENTER
724 Here are the examples from IBM's Knowledge Center and the
725 corresponding command using GNU B<parallel>:
727 =head4 To obtain results similar to those of the B<ls> command, enter:
729 1$ apply echo *
730 1$ parallel echo ::: *
732 =head4 To compare the file named a1 to the file named b1, and
733 the file named a2 to the file named b2, enter:
735 2$ apply -2 cmp a1 b1 a2 b2
736 2$ parallel -N2 cmp ::: a1 b1 a2 b2
738 =head4 To run the B<who> command five times, enter:
740 3$ apply -0 who 1 2 3 4 5
741 3$ parallel -N0 who ::: 1 2 3 4 5
743 =head4 To link all files in the current directory to the directory
744 /usr/joe, enter:
746 4$ apply 'ln %1 /usr/joe' *
747 4$ parallel ln {} /usr/joe ::: *
https://www-01.ibm.com/support/knowledgecenter/ssw_aix_71/com.ibm.aix.cmds1/apply.htm
751 (Last checked: 2019-01)
754 =head2 DIFFERENCES BETWEEN paexec AND GNU Parallel
756 B<paexec> can run jobs in parallel on both the local and remote computers.
758 B<paexec> requires commands to print a blank line as the last
759 output. This means you will have to write a wrapper for most programs.
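
A wrapper along these lines might look as follows. This is a sketch
based only on the requirement stated above (a blank line as the last
output of each task); the function name C<toupper_wrapper> and the use
of B<tr> are assumptions for illustration, not code from B<paexec>.

```shell
# Hypothetical wrapper: read one task per line, print the result,
# then print the blank line that marks the end of the task's output.
toupper_wrapper() {
  while IFS= read -r task; do
    printf '%s\n' "$task" | tr '[:lower:]' '[:upper:]'
    echo
  done
}
wrapped=$(printf 'apple\nbanana\n' | toupper_wrapper)
printf '%s\n' "$wrapped"
```

GNU B<parallel> needs no such marker, because it treats each started
command as one job and collects its output until the command exits.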
761 B<paexec> has a job dependency facility so a job can depend on another
762 job to be executed successfully. Sort of a poor-man's B<make>.
764 =head3 EXAMPLES FROM paexec's EXAMPLE CATALOG
766 Here are the examples from B<paexec>'s example catalog with the equivalent
767 using GNU B<parallel>:
769 =head4 1_div_X_run
771 1$ ../../paexec -s -l -c "`pwd`/1_div_X_cmd" -n +1 <<EOF [...]
773 1$ parallel echo {} '|' `pwd`/1_div_X_cmd <<EOF [...]
775 =head4 all_substr_run
777 2$ ../../paexec -lp -c "`pwd`/all_substr_cmd" -n +3 <<EOF [...]
779 2$ parallel echo {} '|' `pwd`/all_substr_cmd <<EOF [...]
781 =head4 cc_wrapper_run
783 3$ ../../paexec -c "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
784 -n 'host1 host2' \
785 -t '/usr/bin/ssh -x' <<EOF [...]
787 3$ parallel echo {} '|' "env CC=gcc CFLAGS=-O2 `pwd`/cc_wrapper_cmd" \
788 -S host1,host2 <<EOF [...]
790 # This is not exactly the same, but avoids the wrapper
791 parallel gcc -O2 -c -o {.}.o {} \
792 -S host1,host2 <<EOF [...]
794 =head4 toupper_run
796 4$ ../../paexec -lp -c "`pwd`/toupper_cmd" -n +10 <<EOF [...]
798 4$ parallel echo {} '|' ./toupper_cmd <<EOF [...]
800 # Without the wrapper:
801 parallel echo {} '| awk {print\ toupper\(\$0\)}' <<EOF [...]
803 https://github.com/cheusov/paexec
804 (Last checked: 2010-12)
807 =head2 DIFFERENCES BETWEEN map(sitaramc) AND GNU Parallel
809 Summary (see legend above):
811 =over
813 =item I1 - - I4 - - (I7)
815 =item M1 (M2) M3 (M4) M5 M6
817 =item - O2 O3 - O5 - - x x O10
819 =item E1 - - - - - -
821 =item - - - - - - - - -
823 =item - -
825 =back
827 (I7): Only under special circumstances. See below.
829 (M2+M4): Only if there is a single replacement string.
831 B<map> rejects input with special characters:
833 echo "The Cure" > My\ brother\'s\ 12\"\ records
835 ls | map 'echo %; wc %'
837 It works with GNU B<parallel>:
839 ls | parallel 'echo {}; wc {}'
841 Under some circumstances it also works with B<map>:
843 ls | map 'echo % works %'
845 But tiny changes make it reject the input with special characters:
847 ls | map 'echo % does not work "%"'
849 This means that many UTF-8 characters will be rejected. This is by
850 design. From the web page: "As such, programs that I<quietly handle
851 them, with no warnings at all,> are doing their users a disservice."
853 B<map> delays each job by 0.01 s. This can be emulated by using
854 B<parallel --delay 0.01>.
856 B<map> prints '+' on stderr when a job starts, and '-' when a job
857 finishes. This cannot be disabled. B<parallel> has B<--bar> if you
858 need to see progress.
860 B<map>'s replacement strings (% %D %B %E) can be simulated in GNU
861 B<parallel> by putting this in B<~/.parallel/config>:
863 --rpl '%'
864 --rpl '%D $_=Q(::dirname($_));'
865 --rpl '%B s:.*/::;s:\.[^/.]+$::;'
866 --rpl '%E s:.*\.::'
868 B<map> does not have an argument separator on the command line, but
869 uses the first argument as command. This makes quoting harder which again
870 may affect readability. Compare:
872 map -p 2 'perl -ne '"'"'/^\S+\s+\S+$/ and print $ARGV,"\n"'"'" *
874 parallel -q perl -ne '/^\S+\s+\S+$/ and print $ARGV,"\n"' ::: *
876 B<map> can do multiple arguments with context replace, but not without
877 context replace:
879 parallel --xargs echo 'BEGIN{'{}'}END' ::: 1 2 3
881 map "echo 'BEGIN{'%'}END'" 1 2 3
883 B<map> has no support for grouping. So this gives the wrong results:
885 parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
886 ::: a b c d e f
887 ls -l a b c d e f
888 parallel -kP4 -n1 grep 1 ::: a b c d e f > out.par
889 map -n1 -p 4 'grep 1' a b c d e f > out.map-unbuf
890 map -n1 -p 4 'grep --line-buffered 1' a b c d e f > out.map-linebuf
891 map -n1 -p 1 'grep --line-buffered 1' a b c d e f > out.map-serial
892 ls -l out*
893 md5sum out*
895 =head3 EXAMPLES FROM map's WEBSITE
897 Here are the examples from B<map>'s web page with the equivalent using
898 GNU B<parallel>:
900 1$ ls *.gif | map convert % %B.png # default max-args: 1
902 1$ ls *.gif | parallel convert {} {.}.png
904 2$ map "mkdir %B; tar -C %B -xf %" *.tgz # default max-args: 1
906 2$ parallel 'mkdir {.}; tar -C {.} -xf {}' ::: *.tgz
908 3$ ls *.gif | map cp % /tmp # default max-args: 100
910 3$ ls *.gif | parallel -X cp {} /tmp
912 4$ ls *.tar | map -n 1 tar -xf %
914 4$ ls *.tar | parallel tar -xf
916 5$ map "cp % /tmp" *.tgz
918 5$ parallel cp {} /tmp ::: *.tgz
920 6$ map "du -sm /home/%/mail" alice bob carol
922 6$ parallel "du -sm /home/{}/mail" ::: alice bob carol
923 or if you prefer running a single job with multiple args:
924 6$ parallel -Xj1 "du -sm /home/{}/mail" ::: alice bob carol
926 7$ cat /etc/passwd | map -d: 'echo user %1 has shell %7'
928 7$ cat /etc/passwd | parallel --colsep : 'echo user {1} has shell {7}'
930 8$ export MAP_MAX_PROCS=$(( `nproc` / 2 ))
932 8$ export PARALLEL=-j50%
934 https://github.com/sitaramc/map
935 (Last checked: 2020-05)
938 =head2 DIFFERENCES BETWEEN ladon AND GNU Parallel
940 B<ladon> can run multiple jobs on files in parallel.
942 B<ladon> only works on files and the only way to specify files is
943 using a quoted glob string (such as \*.jpg). It is not possible to
944 list the files manually.
As replacement strings it uses FULLPATH, DIRNAME, BASENAME, EXT,
RELDIR, and RELPATH.
949 These can be simulated using GNU B<parallel> by putting this in
950 B<~/.parallel/config>:
952 --rpl 'FULLPATH $_=Q($_);chomp($_=qx{readlink -f $_});'
953 --rpl 'DIRNAME $_=Q(::dirname($_));chomp($_=qx{readlink -f $_});'
954 --rpl 'BASENAME s:.*/::;s:\.[^/.]+$::;'
955 --rpl 'EXT s:.*\.::'
956 --rpl 'RELDIR $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
957 s:\Q$c/\E::;$_=::dirname($_);'
958 --rpl 'RELPATH $_=Q($_);chomp(($_,$c)=qx{readlink -f $_;pwd});
959 s:\Q$c/\E::;'
961 B<ladon> deals badly with filenames containing " and newline, and it
962 fails for output larger than 200k:
964 ladon '*' -- seq 36000 | wc
966 =head3 EXAMPLES FROM ladon MANUAL
968 It is assumed that the '--rpl's above are put in B<~/.parallel/config>
969 and that it is run under a shell that supports '**' globbing (such as B<zsh>):
971 1$ ladon "**/*.txt" -- echo RELPATH
973 1$ parallel echo RELPATH ::: **/*.txt
975 2$ ladon "~/Documents/**/*.pdf" -- shasum FULLPATH >hashes.txt
977 2$ parallel shasum FULLPATH ::: ~/Documents/**/*.pdf >hashes.txt
979 3$ ladon -m thumbs/RELDIR "**/*.jpg" -- convert FULLPATH \
980 -thumbnail 100x100^ -gravity center -extent 100x100 \
981 thumbs/RELPATH
  3$ parallel mkdir -p thumbs/RELDIR\; convert FULLPATH \
       -thumbnail 100x100^ -gravity center -extent 100x100 \
       thumbs/RELPATH ::: **/*.jpg
987 4$ ladon "~/Music/*.wav" -- lame -V 2 FULLPATH DIRNAME/BASENAME.mp3
989 4$ parallel lame -V 2 FULLPATH DIRNAME/BASENAME.mp3 ::: ~/Music/*.wav
991 https://github.com/danielgtaylor/ladon
992 (Last checked: 2019-01)
995 =head2 DIFFERENCES BETWEEN jobflow AND GNU Parallel
997 Summary (see legend above):
999 =over
1001 =item I1 - - - - - I7
1003 =item - - M3 - - (M6)
1005 =item O1 O2 O3 - O5 O6 (O7) - - O10
1007 =item E1 - - - - E6 -
1009 =item - - - - - - - - -
1011 =item - -
1013 =back
1016 B<jobflow> can run multiple jobs in parallel.
1018 Just like B<xargs> output from B<jobflow> jobs running in parallel mix
1019 together by default. B<jobflow> can buffer into files with
1020 B<-buffered> (placed in /run/shm), but these are not cleaned up if
1021 B<jobflow> dies unexpectedly (e.g. by Ctrl-C). If the total output is
1022 big (in the order of RAM+swap) it can cause the system to slow to a
1023 crawl and eventually run out of memory.
1025 Just like B<xargs> redirection and composed commands require wrapping
1026 with B<bash -c>.
1028 Input lines can at most be 4096 bytes.
1030 B<jobflow> is faster than GNU B<parallel> but around 6 times slower
1031 than B<parallel-bash>.
1033 B<jobflow> has no equivalent for B<--pipe>, or B<--sshlogin>.
1035 B<jobflow> makes it possible to set resource limits on the running
1036 jobs. This can be emulated by GNU B<parallel> using B<bash>'s B<ulimit>:
1038 jobflow -limits=mem=100M,cpu=3,fsize=20M,nofiles=300 myjob
  parallel 'ulimit -v 102400 -t 3 -f 20480 -n 300; myjob'
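
As a small illustration of why B<ulimit> works for this: a limit set
inside a subshell applies to that subshell and the commands it starts,
and leaves the calling shell untouched. The value 64 is an arbitrary
choice for the example and assumes the current limit on open files is
at least that high.

```shell
# The limit applies only inside the parentheses (a subshell).
inner_nofiles=$( ( ulimit -n 64; ulimit -n ) )
# The calling shell keeps whatever limit it had before.
outer_nofiles=$(ulimit -n)
echo "inside: $inner_nofiles, outside: $outer_nofiles"
```

Each job started by GNU B<parallel> runs in its own shell, so a
B<ulimit> at the start of the command only affects that job.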
1043 =head3 EXAMPLES FROM jobflow README
1045 1$ cat things.list | jobflow -threads=8 -exec ./mytask {}
1047 1$ cat things.list | parallel -j8 ./mytask {}
1049 2$ seq 100 | jobflow -threads=100 -exec echo {}
1051 2$ seq 100 | parallel -j100 echo {}
1053 3$ cat urls.txt | jobflow -threads=32 -exec wget {}
1055 3$ cat urls.txt | parallel -j32 wget {}
1057 4$ find . -name '*.bmp' | \
1058 jobflow -threads=8 -exec bmp2jpeg {.}.bmp {.}.jpg
1060 4$ find . -name '*.bmp' | \
1061 parallel -j8 bmp2jpeg {.}.bmp {.}.jpg
1063 5$ seq 100 | jobflow -skip 10 -count 10
1065 5$ seq 100 | parallel --filter '{1} > 10 and {1} <= 20' echo
1067 5$ seq 100 | parallel echo '{= $_>10 and $_<=20 or skip() =}'
1069 https://github.com/rofl0r/jobflow
1070 (Last checked: 2022-05)
1073 =head2 DIFFERENCES BETWEEN gargs AND GNU Parallel
1075 B<gargs> can run multiple jobs in parallel.
1077 Older versions cache output in memory. This causes it to be extremely
1078 slow when the output is larger than the physical RAM, and can cause
1079 the system to run out of memory.
1081 See more details on this in B<man parallel_design>.
1083 Newer versions cache output in files, but leave files in $TMPDIR if it
1084 is killed.
1086 Output to stderr (standard error) is changed if the command fails.
1088 =head3 EXAMPLES FROM gargs WEBSITE
1090 1$ seq 12 -1 1 | gargs -p 4 -n 3 "sleep {0}; echo {1} {2}"
1092 1$ seq 12 -1 1 | parallel -P 4 -n 3 "sleep {1}; echo {2} {3}"
1094 2$ cat t.txt | gargs --sep "\s+" \
1095 -p 2 "echo '{0}:{1}-{2}' full-line: \'{}\'"
1097 2$ cat t.txt | parallel --colsep "\\s+" \
1098 -P 2 "echo '{1}:{2}-{3}' full-line: \'{}\'"
1100 https://github.com/brentp/gargs
1101 (Last checked: 2016-08)
1104 =head2 DIFFERENCES BETWEEN orgalorg AND GNU Parallel
1106 B<orgalorg> can run the same job on multiple machines. This is related
1107 to B<--onall> and B<--nonall>.
1109 B<orgalorg> supports entering the SSH password - provided it is the
1110 same for all servers. GNU B<parallel> advocates using B<ssh-agent>
1111 instead, but it is possible to emulate B<orgalorg>'s behavior by
1112 setting SSHPASS and by using B<--ssh "sshpass ssh">.
1114 To make the emulation easier, make a simple alias:
1116 alias par_emul="parallel -j0 --ssh 'sshpass ssh' --nonall --tag --lb"
1118 If you want to supply a password run:
1120 SSHPASS=`ssh-askpass`
1122 or set the password directly:
  SSHPASS='P4$$w0rd!'
1126 If the above is set up you can then do:
1128 orgalorg -o frontend1 -o frontend2 -p -C uptime
1129 par_emul -S frontend1 -S frontend2 uptime
1131 orgalorg -o frontend1 -o frontend2 -p -C top -bid 1
1132 par_emul -S frontend1 -S frontend2 top -bid 1
1134 orgalorg -o frontend1 -o frontend2 -p -er /tmp -n \
1135 'md5sum /tmp/bigfile' -S bigfile
1136 par_emul -S frontend1 -S frontend2 --basefile bigfile \
1137 --workdir /tmp md5sum /tmp/bigfile
1139 B<orgalorg> has a progress indicator for the transferring of a
1140 file. GNU B<parallel> does not.
1142 https://github.com/reconquest/orgalorg
1143 (Last checked: 2016-08)
1146 =head2 DIFFERENCES BETWEEN Rust parallel(mmstick) AND GNU Parallel
1148 Rust parallel focuses on speed. It is almost as fast as B<xargs>, but
1149 not as fast as B<parallel-bash>. It implements a few features from GNU
1150 B<parallel>, but lacks many functions. All these fail:
1152 # Read arguments from file
1153 parallel -a file echo
1154 # Changing the delimiter
1155 parallel -d _ echo ::: a_b_c_
These do something different from GNU B<parallel>:
1159 # -q to protect quoted $ and space
1160 parallel -q perl -e '$a=shift; print "$a"x10000000' ::: a b c
1161 # Generation of combination of inputs
1162 parallel echo {1} {2} ::: red green blue ::: S M L XL XXL
1163 # {= perl expression =} replacement string
1164 parallel echo '{= s/new/old/ =}' ::: my.new your.new
1165 # --pipe
1166 seq 100000 | parallel --pipe wc
1167 # linked arguments
1168 parallel echo ::: S M L :::+ sml med lrg ::: R G B :::+ red grn blu
1169 # Run different shell dialects
1170 zsh -c 'parallel echo \={} ::: zsh && true'
1171 csh -c 'parallel echo \$\{\} ::: shell && true'
1172 bash -c 'parallel echo \$\({}\) ::: pwd && true'
1173 # Rust parallel does not start before the last argument is read
1174 (seq 10; sleep 5; echo 2) | time parallel -j2 'sleep 2; echo'
1175 tail -f /var/log/syslog | parallel echo
1177 Most of the examples from the book GNU Parallel 2018 do not work, thus
1178 Rust parallel is not close to being a compatible replacement.
1180 Rust parallel has no remote facilities.
It uses /tmp/parallel for temporary files and does not clean up if
terminated abruptly. If another user on the system uses Rust parallel,
then /tmp/parallel will have the wrong permissions and Rust parallel
will fail. A malicious user can set up the right permissions and
symlink the output file to one of the victim's files; the next time
the victim uses Rust parallel, it will overwrite that file.
1189 attacker$ mkdir /tmp/parallel
1190 attacker$ chmod a+rwX /tmp/parallel
1191 # Symlink to the file the attacker wants to zero out
1192 attacker$ ln -s ~victim/.important-file /tmp/parallel/stderr_1
1193 victim$ seq 1000 | parallel echo
1194 # This file is now overwritten with stderr from 'echo'
1195 victim$ cat ~victim/.important-file
1197 If /tmp/parallel runs full during the run, Rust parallel does not
1198 report this, but finishes with success - thereby risking data loss.
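A tool that buffers in temporary files can avoid both problems by
creating a private directory per run and by checking that writes
succeed. The following is a minimal shell sketch of that idea, not a
description of how Rust parallel or GNU B<parallel> is implemented; the
function name B<make_private_tmpdir> and the B<jobs.XXXXXX> template
are made up for illustration:

```shell
# Hypothetical helper: create a fresh directory that only the
# current user can access. mktemp -d picks an unpredictable name
# and never reuses an existing path, so a directory or symlink
# prepared by another user is not picked up.
make_private_tmpdir() {
  mktemp -d "${TMPDIR:-/tmp}/jobs.XXXXXX"
}

workdir=$(make_private_tmpdir) || exit 1
# Clean up on normal exit (a kill -9 would still leave the directory)
trap 'rm -rf "$workdir"' EXIT

# Check the result of the write, so a full disk is reported
# instead of silently ignored.
if ! echo "job output" > "$workdir/stdout_1"; then
  echo "cannot write to $workdir" >&2
  exit 1
fi
```

With a fixed, shared path such as /tmp/parallel neither the ownership
nor the content of the directory can be trusted; a per-run directory
sidesteps that.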
1200 https://github.com/mmstick/parallel
1201 (Last checked: 2016-08)
1204 =head2 DIFFERENCES BETWEEN Rush AND GNU Parallel
1206 B<rush> (https://github.com/shenwei356/rush) is written in Go and
1207 based on B<gargs>.
Like GNU B<parallel>, B<rush> buffers output in temporary files. But
unlike GNU B<parallel>, B<rush> does not clean up if the process
dies abnormally.
1213 B<rush> has some string manipulations that can be emulated by putting
1214 this into ~/.parallel/config (/ is used instead of %, and % is used
1215 instead of ^ as that is closer to bash's ${var%postfix}):
1217 --rpl '{:} s:(\.[^/]+)*$::'
1218 --rpl '{:%([^}]+?)} s:$$1(\.[^/]+)*$::'
1219 --rpl '{/:%([^}]*?)} s:.*/(.*)$$1(\.[^/]+)*$:$1:'
1220 --rpl '{/:} s:(.*/)?([^/.]+)(\.[^/]+)*$:$2:'
1221 --rpl '{@(.*?)} /$$1/ and $_=$1;'
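To see what the B<{:}> and B<{/:}> definitions do, the same
substitutions can be applied to a path outside GNU B<parallel>. This
is only a sketch: B<sed -E> stands in for the Perl substitution (the
two agree on this input), and the function names are made up for
illustration:

```shell
# Hypothetical helper: same substitution as the {:} definition,
# i.e. remove every extension after the last /
strip_all_ext() {
  printf '%s\n' "$1" | sed -E 's:(\.[^/]+)*$::'
}

# Hypothetical helper: same substitution as the {/:} definition,
# i.e. remove the directory and every extension
base_strip_all_ext() {
  printf '%s\n' "$1" | sed -E 's:(.*/)?([^/.]+)(\.[^/]+)*$:\2:'
}

strip_all_ext dir.d/file.txt.gz        # prints dir.d/file
base_strip_all_ext dir.d/file.txt.gz   # prints file
```

Note that the dot in the directory name B<dir.d> is left alone,
because the extension pattern cannot match across a /.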
1223 =head3 EXAMPLES FROM rush's WEBSITE
1225 Here are the examples from B<rush>'s website with the equivalent
1226 command in GNU B<parallel>.
1228 B<1. Simple run, quoting is not necessary>
1230 1$ seq 1 3 | rush echo {}
1232 1$ seq 1 3 | parallel echo {}
1234 B<2. Read data from file (`-i`)>
1236 2$ rush echo {} -i data1.txt -i data2.txt
1238 2$ cat data1.txt data2.txt | parallel echo {}
1240 B<3. Keep output order (`-k`)>
1242 3$ seq 1 3 | rush 'echo {}' -k
1244 3$ seq 1 3 | parallel -k echo {}
1247 B<4. Timeout (`-t`)>
1249 4$ time seq 1 | rush 'sleep 2; echo {}' -t 1
1251 4$ time seq 1 | parallel --timeout 1 'sleep 2; echo {}'
1253 B<5. Retry (`-r`)>
1255 5$ seq 1 | rush 'python unexisted_script.py' -r 1
1257 5$ seq 1 | parallel --retries 2 'python unexisted_script.py'
1259 Use B<-u> to see it is really run twice:
1261 5$ seq 1 | parallel -u --retries 2 'python unexisted_script.py'
1263 B<6. Dirname (`{/}`) and basename (`{%}`) and remove custom
1264 suffix (`{^suffix}`)>
1266 6$ echo dir/file_1.txt.gz | rush 'echo {/} {%} {^_1.txt.gz}'
1268 6$ echo dir/file_1.txt.gz |
1269 parallel --plus echo {//} {/} {%_1.txt.gz}
1271 B<7. Get basename, and remove last (`{.}`) or any (`{:}`) extension>
1273 7$ echo dir.d/file.txt.gz | rush 'echo {.} {:} {%.} {%:}'
1275 7$ echo dir.d/file.txt.gz | parallel 'echo {.} {:} {/.} {/:}'
1277 B<8. Job ID, combine fields index and other replacement strings>
1279 8$ echo 12 file.txt dir/s_1.fq.gz |
1280 rush 'echo job {#}: {2} {2.} {3%:^_1}'
1282 8$ echo 12 file.txt dir/s_1.fq.gz |
1283 parallel --colsep ' ' 'echo job {#}: {2} {2.} {3/:%_1}'
1285 B<9. Capture submatch using regular expression (`{@regexp}`)>
1287 9$ echo read_1.fq.gz | rush 'echo {@(.+)_\d}'
1289 9$ echo read_1.fq.gz | parallel 'echo {@(.+)_\d}'
1291 B<10. Custom field delimiter (`-d`)>
1293 10$ echo a=b=c | rush 'echo {1} {2} {3}' -d =
1295 10$ echo a=b=c | parallel -d = echo {1} {2} {3}
1297 B<11. Send multi-lines to every command (`-n`)>
1299 11$ seq 5 | rush -n 2 -k 'echo "{}"; echo'
1301 11$ seq 5 |
1302 parallel -n 2 -k \
1303 'echo {=-1 $_=join"\n",@arg[1..$#arg] =}; echo'
1305 11$ seq 5 | rush -n 2 -k 'echo "{}"; echo' -J ' '
1307 11$ seq 5 | parallel -n 2 -k 'echo {}; echo'
1310 B<12. Custom record delimiter (`-D`), note that empty records are not used.>
1312 12$ echo a b c d | rush -D " " -k 'echo {}'
1314 12$ echo a b c d | parallel -d " " -k 'echo {}'
1316 12$ echo abcd | rush -D "" -k 'echo {}'
1318 Cannot be done by GNU Parallel
1320 12$ cat fasta.fa
1321 >seq1
1323 >seq2
1326 >seq3
1327 attac
1331 12$ cat fasta.fa | rush -D ">" \
1332 'echo FASTA record {#}: name: {1} sequence: {2}' -k -d "\n"
1333 # rush fails to join the multiline sequences
  12$ cat fasta.fa | (read -n1 ignore_first_char;
        parallel -d '>' --colsep '\n' echo FASTA record {#}: \
          name: {1} sequence: '{=2 $_=join"",@arg[2..$#arg]=}'
      )
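The joining of multi-line sequences can also be done before the
records reach GNU B<parallel>. A minimal B<awk> sketch, assuming every
record starts with a line beginning with '>'; the function name
B<fasta_records> and the sample input are made up for illustration:

```shell
# Hypothetical helper: read FASTA on stdin and print one line per
# record: the name, a TAB, and the sequence lines joined together.
fasta_records() {
  awk '
    /^>/ { if (name != "") print name "\t" seq
           name = substr($0, 2); seq = ""; next }
         { seq = seq $0 }
    END  { if (name != "") print name "\t" seq }
  '
}

# Made-up sample input with a single-line and a two-line sequence
printf '>seq1\nacgt\n>seq2\naaa\nccc\n' | fasta_records
```

The output has one record per line, so it can be split into columns
with B<--colsep '\t'> and used as B<{1}> and B<{2}>.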
1340 B<13. Assign value to variable, like `awk -v` (`-v`)>
1342 13$ seq 1 |
1343 rush 'echo Hello, {fname} {lname}!' -v fname=Wei -v lname=Shen
1345 13$ seq 1 |
1346 parallel -N0 \
1347 'fname=Wei; lname=Shen; echo Hello, ${fname} ${lname}!'
1349 13$ for var in a b; do \
1350 13$ seq 1 3 | rush -k -v var=$var 'echo var: {var}, data: {}'; \
1351 13$ done
1353 In GNU B<parallel> you would typically do:
1355 13$ seq 1 3 | parallel -k echo var: {1}, data: {2} ::: a b :::: -
1357 If you I<really> want the var:
1359 13$ seq 1 3 |
1360 parallel -k var={1} ';echo var: $var, data: {}' ::: a b :::: -
1362 If you I<really> want the B<for>-loop:
1364 13$ for var in a b; do
1365 export var;
1366 seq 1 3 | parallel -k 'echo var: $var, data: {}';
1367 done
Unlike B<rush>, this also works if the value is complex, like:
1371 My brother's 12" records
1374 B<14. Preset variable (`-v`), avoid repeatedly writing verbose replacement strings>
1376 14$ # naive way
1377 echo read_1.fq.gz | rush 'echo {:^_1} {:^_1}_2.fq.gz'
1379 14$ echo read_1.fq.gz | parallel 'echo {:%_1} {:%_1}_2.fq.gz'
1381 14$ # macro + removing suffix
1382 echo read_1.fq.gz |
1383 rush -v p='{:^_1}' 'echo {p} {p}_2.fq.gz'
1385 14$ echo read_1.fq.gz |
1386 parallel 'p={:%_1}; echo $p ${p}_2.fq.gz'
1388 14$ # macro + regular expression
1389 echo read_1.fq.gz | rush -v p='{@(.+?)_\d}' 'echo {p} {p}_2.fq.gz'
1391 14$ echo read_1.fq.gz | parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
Unlike B<rush>, GNU B<parallel> works with complex values:
1395 14$ echo "My brother's 12\"read_1.fq.gz" |
1396 parallel 'p={@(.+?)_\d}; echo $p ${p}_2.fq.gz'
1398 B<15. Interrupt jobs by `Ctrl-C`, rush will stop unfinished commands and exit.>
1400 15$ seq 1 20 | rush 'sleep 1; echo {}'
1403 15$ seq 1 20 | parallel 'sleep 1; echo {}'
1406 B<16. Continue/resume jobs (`-c`). When some jobs failed (by
1407 execution failure, timeout, or canceling by user with `Ctrl + C`),
1408 please switch flag `-c/--continue` on and run again, so that `rush`
1409 can save successful commands and ignore them in I<NEXT> run.>
1411 16$ seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1412 cat successful_cmds.rush
1413 seq 1 3 | rush 'sleep {}; echo {}' -t 3 -c
1415 16$ seq 1 3 | parallel --joblog mylog --timeout 2 \
1416 'sleep {}; echo {}'
1417 cat mylog
1418 seq 1 3 | parallel --joblog mylog --retry-failed \
1419 'sleep {}; echo {}'
1421 Multi-line jobs:
1423 16$ seq 1 3 | rush 'sleep {}; echo {}; \
1424 echo finish {}' -t 3 -c -C finished.rush
1425 cat finished.rush
1426 seq 1 3 | rush 'sleep {}; echo {}; \
1427 echo finish {}' -t 3 -c -C finished.rush
1429 16$ seq 1 3 |
1430 parallel --joblog mylog --timeout 2 'sleep {}; echo {}; \
1431 echo finish {}'
1432 cat mylog
1433 seq 1 3 |
1434 parallel --joblog mylog --retry-failed 'sleep {}; echo {}; \
1435 echo finish {}'
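The joblog is a tab separated file, so it is easy to inspect which
jobs did not finish cleanly before rerunning them. A minimal sketch,
assuming the usual column order (Seq, Host, Starttime, JobRuntime,
Send, Receive, Exitval, Signal, Command); the function name
B<failed_jobs> and the sample log content are made up for
illustration:

```shell
# Hypothetical helper: print Seq and Command for every job in a
# GNU parallel joblog whose Exitval (column 7) or Signal (column 8)
# is non-zero. The first line of the joblog is a header.
failed_jobs() {
  awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $1 "\t" $9 }' "$1"
}

# Made-up joblog: one successful job and one job that was killed
log=$(mktemp) || exit 1
printf 'Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n' > "$log"
printf '1\t:\t1700000000.000\t1.002\t0\t2\t0\t0\tsleep 1; echo 1\n' >> "$log"
printf '2\t:\t1700000000.000\t2.001\t0\t0\t-1\t15\tsleep 3; echo 3\n' >> "$log"
failed_jobs "$log"
```

Because the command is the last column, spaces in the command do not
disturb the parsing.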
1437 B<17. A comprehensive example: downloading 1K+ pages given by
1438 three URL list files using `phantomjs save_page.js` (some page
1439 contents are dynamically generated by Javascript, so `wget` does not
1440 work). Here I set max jobs number (`-j`) as `20`, each job has a max
running time (`-t`) of `60` seconds and `3` retry chances
1442 (`-r`). Continue flag `-c` is also switched on, so we can continue
1443 unfinished jobs. Luckily, it's accomplished in one run :)>
1445 17$ for f in $(seq 2014 2016); do \
1446 /bin/rm -rf $f; mkdir -p $f; \
1447 cat $f.html.txt | rush -v d=$f -d = \
1448 'phantomjs save_page.js "{}" > {d}/{3}.html' \
1449 -j 20 -t 60 -r 3 -c; \
1450 done
1452 GNU B<parallel> can append to an existing joblog with '+':
1454 17$ rm mylog
1455 for f in $(seq 2014 2016); do
1456 /bin/rm -rf $f; mkdir -p $f;
1457 cat $f.html.txt |
1458 parallel -j20 --timeout 60 --retries 4 --joblog +mylog \
1459 --colsep = \
1460 phantomjs save_page.js {1}={2}={3} '>' $f/{3}.html
1461 done
1463 B<18. A bioinformatics example: mapping with `bwa`, and
1464 processing result with `samtools`:>
1466 18$ ref=ref/xxx.fa
1467 threads=25
1468 ls -d raw.cluster.clean.mapping/* \
1469 | rush -v ref=$ref -v j=$threads -v p='{}/{%}' \
1470 'bwa mem -t {j} -M -a {ref} {p}_1.fq.gz {p}_2.fq.gz >{p}.sam;\
1471 samtools view -bS {p}.sam > {p}.bam; \
1472 samtools sort -T {p}.tmp -@ {j} {p}.bam -o {p}.sorted.bam; \
1473 samtools index {p}.sorted.bam; \
1474 samtools flagstat {p}.sorted.bam > {p}.sorted.bam.flagstat; \
1475 /bin/rm {p}.bam {p}.sam;' \
1476 -j 2 --verbose -c -C mapping.rush
1478 GNU B<parallel> would use a function:
1480 18$ ref=ref/xxx.fa
1481 export ref
1482 thr=25
1483 export thr
  bwa_sam() {
    p="$1"
    bam="$p".bam
    sam="$p".sam
    sortbam="$p".sorted.bam
    bwa mem -t $thr -M -a $ref ${p}_1.fq.gz ${p}_2.fq.gz > "$sam"
    samtools view -bS "$sam" > "$bam"
    samtools sort -T ${p}.tmp -@ $thr "$bam" -o "$sortbam"
    samtools index "$sortbam"
    samtools flagstat "$sortbam" > "$sortbam".flagstat
    /bin/rm "$bam" "$sam"
  }
  export -f bwa_sam
1497 ls -d raw.cluster.clean.mapping/* |
1498 parallel -j 2 --verbose --joblog mylog bwa_sam
1500 =head3 Other B<rush> features
1502 B<rush> has:
1504 =over 4
1506 =item * B<awk -v> like custom defined variables (B<-v>)
1508 With GNU B<parallel> you would simply set a shell variable:
1510 parallel 'v={}; echo "$v"' ::: foo
1511 echo foo | rush -v v={} 'echo {v}'
1513 Also B<rush> does not like special chars. So these B<do not work>:
1515 echo does not work | rush -v v=\" 'echo {v}'
1516 echo "My brother's 12\" records" | rush -v v={} 'echo {v}'
1518 Whereas the corresponding GNU B<parallel> version works:
1520 parallel 'v=\"; echo "$v"' ::: works
1521 parallel 'v={}; echo "$v"' ::: "My brother's 12\" records"
1523 =item * Exit on first error(s) (-e)
1525 This is called B<--halt now,fail=1> (or shorter: B<--halt 2>) when
1526 used with GNU B<parallel>.
1528 =item * Settable records sending to every command (B<-n>, default 1)
1530 This is also called B<-n> in GNU B<parallel>.
1532 =item * Practical replacement strings
1534 =over 4
1536 =item {:} remove any extension
1538 With GNU B<parallel> this can be emulated by:
1540 parallel --plus echo '{/\..*/}' ::: foo.ext.bar.gz
1542 =item {^suffix}, remove suffix
1544 With GNU B<parallel> this can be emulated by:
1546 parallel --plus echo '{%.bar.gz}' ::: foo.ext.bar.gz
1548 =item {@regexp}, capture submatch using regular expression
1550 With GNU B<parallel> this can be emulated by:
1552 parallel --rpl '{@(.*?)} /$$1/ and $_=$1;' \
1553 echo '{@\d_(.*).gz}' ::: 1_foo.gz
1555 =item {%.}, {%:}, basename without extension
1557 With GNU B<parallel> this can be emulated by:
1559 parallel echo '{= s:.*/::;s/\..*// =}' ::: dir/foo.bar.gz
1561 And if you need it often, you define a B<--rpl> in
1562 B<$HOME/.parallel/config>:
1564 --rpl '{%.} s:.*/::;s/\..*//'
1565 --rpl '{%:} s:.*/::;s/\..*//'
1567 Then you can use them as:
1569 parallel echo {%.} {%:} ::: dir/foo.bar.gz
1571 =back
1573 =item * Preset variable (macro)
1575 E.g.
1577 echo foosuffix | rush -v p={^suffix} 'echo {p}_new_suffix'
1579 With GNU B<parallel> this can be emulated by:
1581 echo foosuffix |
1582 parallel --plus 'p={%suffix}; echo ${p}_new_suffix'
Unlike B<rush>, GNU B<parallel> works fine if the input contains
double spaces, ' and ":
1587 echo "1'6\" foosuffix" |
1588 parallel --plus 'p={%suffix}; echo "${p}"_new_suffix'
1591 =item * Commands of multi-lines
1593 While you I<can> use multi-lined commands in GNU B<parallel>, to
1594 improve readability GNU B<parallel> discourages the use of multi-line
1595 commands. In most cases it can be written as a function:
1597 seq 1 3 |
1598 parallel --timeout 2 --joblog my.log 'sleep {}; echo {}; \
1599 echo finish {}'
1601 Could be written as:
  doit() {
    sleep "$1"
    echo "$1"
    echo finish "$1"
  }
  export -f doit
1609 seq 1 3 | parallel --timeout 2 --joblog my.log doit
1611 The failed commands can be resumed with:
1613 seq 1 3 |
1614 parallel --resume-failed --joblog my.log 'sleep {}; echo {};\
1615 echo finish {}'
1617 =back
1619 https://github.com/shenwei356/rush
1620 (Last checked: 2017-05)
1623 =head2 DIFFERENCES BETWEEN ClusterSSH AND GNU Parallel
1625 ClusterSSH solves a different problem than GNU B<parallel>.
1627 ClusterSSH opens a terminal window for each computer and using a
1628 master window you can run the same command on all the computers. This
1629 is typically used for administrating several computers that are almost
1630 identical.
GNU B<parallel> runs the same (or different) commands with different
arguments in parallel, possibly using remote computers to help
computing. If more than one computer is listed in B<-S>, GNU
B<parallel> may use only one of them (e.g. if there are 8 jobs to be
run and one computer has 8 cores).
1638 GNU B<parallel> can be used as a poor-man's version of ClusterSSH:
1640 B<parallel --nonall -S server-a,server-b do_stuff foo bar>
1642 https://github.com/duncs/clusterssh
1643 (Last checked: 2010-12)
1646 =head2 DIFFERENCES BETWEEN coshell AND GNU Parallel
1648 B<coshell> only accepts full commands on standard input. Any quoting
1649 needs to be done by the user.
1651 Commands are run in B<sh> so any B<bash>/B<tcsh>/B<zsh> specific
1652 syntax will not work.
Output can be buffered by using B<-d>. Output is buffered in memory,
so big output can cause swapping and therefore be terribly slow or
even cause the system to run out of memory.
1658 https://github.com/gdm85/coshell
1659 (Last checked: 2019-01)
1662 =head2 DIFFERENCES BETWEEN spread AND GNU Parallel
1664 =over
1666 =item - - - I4 - - I7
1668 =item M1 - - - - -
1670 =item O1 O2 O3 O4 O5 O6 - O8 - O10
1672 =item - - - - - - -
1674 =item - - - - - - - - -
1676 =item - -
1678 =back
1680 B<spread> runs commands on all directories. It does not run jobs in parallel.
1682 It can be emulated with GNU B<parallel> using this Bash function:
  spread() {
    _cmds() {
      perl -e '$"=" && ";print "@ARGV"' "cd {}" "$@"
    }
    parallel $(_cmds "$@")'|| echo exit status $?' ::: */
  }
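For comparison, the sequential behaviour of B<spread> itself can be
approximated without GNU B<parallel>. This is a sketch under
assumptions, not B<spread>'s actual implementation; the function name
B<spread_seq> is made up for illustration:

```shell
# Hypothetical helper: run the given commands one after another in
# every sub-directory of the current directory. As in the function
# above, the chain stops at the first failing command and the exit
# status is printed. Note: the variables dir and cmd are not local.
spread_seq() {
  for dir in */; do
    (
      cd "$dir" || { echo "exit status $?"; exit; }
      for cmd in "$@"; do
        eval "$cmd" || { echo "exit status $?"; break; }
      done
    )
  done
}
```

Each directory is handled in a subshell, so the B<cd> does not affect
the next iteration. If there are no sub-directories, the unexpanded
pattern is passed to B<cd>, which fails.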
1691 https://github.com/tfogo/spread
1692 (Last checked: 2024-04)
1695 =head2 DIFFERENCES BETWEEN pyargs AND GNU Parallel
1697 B<pyargs> deals badly with input containing spaces. It buffers stdout,
1698 but not stderr. It buffers in RAM. {} does not work as replacement
1699 string. It does not support running functions.
1701 B<pyargs> does not support composed commands if run with B<--lines>,
1702 and fails on B<pyargs traceroute gnu.org fsf.org>.
1704 =head3 Examples
1706 seq 5 | pyargs -P50 -L seq
1707 seq 5 | parallel -P50 --lb seq
1709 seq 5 | pyargs -P50 --mark -L seq
1710 seq 5 | parallel -P50 --lb \
1711 --tagstring OUTPUT'[{= $_=$job->replaced() =}]' seq
1712 # Similar, but not precisely the same
1713 seq 5 | parallel -P50 --lb --tag seq
1715 seq 5 | pyargs -P50 --mark command
1716 # Somewhat longer with GNU Parallel due to the special
1717 # --mark formatting
1718 cmd="$(echo "command" | parallel --shellquote)"
  wrap_cmd() {
     echo "MARK $cmd $@================================" >&3
     echo "OUTPUT START[$cmd $@]:"
     eval $cmd "$@"
     echo "OUTPUT END[$cmd $@]"
  }
  (seq 5 | env_parallel -P2 wrap_cmd) 3>&1
1726 # Similar, but not exactly the same
1727 seq 5 | parallel -t --tag command
1729 (echo '1 2 3';echo 4 5 6) | pyargs --stream seq
1730 (echo '1 2 3';echo 4 5 6) | perl -pe 's/\n/ /' |
1731 parallel -r -d' ' seq
1732 # Similar, but not exactly the same
1733 parallel seq ::: 1 2 3 4 5 6
1735 https://github.com/robertblackwell/pyargs
1736 (Last checked: 2019-01)
1739 =head2 DIFFERENCES BETWEEN concurrently AND GNU Parallel
1741 B<concurrently> runs jobs in parallel.
1743 The output is prepended with the job number, and may be incomplete:
1745 $ concurrently 'seq 100000' | (sleep 3;wc -l)
1746 7165
When pretty printing it caches output in memory. With test MIX below,
output mixes whether or not it is cached.
1751 There seems to be no way of making a template command and have
1752 B<concurrently> fill that with different args. The full commands must
1753 be given on the command line.
1755 There is also no way of controlling how many jobs should be run in
1756 parallel at a time - i.e. "number of jobslots". Instead all jobs are
1757 simply started in parallel.
1759 https://github.com/kimmobrunfeldt/concurrently
1760 (Last checked: 2019-01)
1763 =head2 DIFFERENCES BETWEEN map(soveran) AND GNU Parallel
1765 B<map> does not run jobs in parallel by default. The README suggests using:
1767 ... | map t 'sleep $t && say done &'
1769 But this fails if more jobs are run in parallel than the number of
1770 available processes. Since there is no support for parallelization in
1771 B<map> itself, the output also mixes:
1773 seq 10 | map i 'echo start-$i && sleep 0.$i && echo end-$i &'
1775 The major difference is that GNU B<parallel> is built for parallelization
1776 and B<map> is not. So GNU B<parallel> has lots of ways of dealing with the
1777 issues that parallelization raises:
1779 =over 4
1781 =item *
1783 Keep the number of processes manageable
1785 =item *
1787 Make sure output does not mix
1789 =item *
1791 Make Ctrl-C kill all running processes
1793 =back
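To illustrate the first two points, here is a minimal shell sketch of
what a tool has to do at the very least: limit the number of running
jobs and keep each job's output separate. It is far cruder than GNU
B<parallel> (it runs jobs in batches instead of starting a new job as
soon as a slot is free, and it does not handle Ctrl-C); the function
name B<run_limited> is made up for illustration:

```shell
# Hypothetical helper: run_limited SLOTS CMD...
# Runs the commands in batches of at most SLOTS at a time. Each job
# writes to its own file, and the files are printed one after
# another, so output from different jobs cannot mix.
run_limited() (
  slots=$1; shift
  outdir=$(mktemp -d "${TMPDIR:-/tmp}/run_limited.XXXXXX") || exit 1
  n=0
  running=0
  for cmd in "$@"; do
    n=$((n + 1))
    sh -c "$cmd" > "$outdir/$n" 2>&1 &
    running=$((running + 1))
    if [ "$running" -ge "$slots" ]; then
      wait          # all slots are taken: wait for the batch
      running=0
    fi
  done
  wait              # wait for the last (partial) batch
  i=1
  while [ "$i" -le "$n" ]; do
    cat "$outdir/$i"
    i=$((i + 1))
  done
  rm -rf "$outdir"
)

run_limited 2 'echo start-1; echo end-1' 'echo start-2; echo end-2' \
  'echo start-3; echo end-3'
```

The function body runs in a subshell, so its variables and background
jobs do not leak into the calling shell.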
=head3 EXAMPLES FROM map's WEBSITE
1797 Here are the 5 examples converted to GNU Parallel:
1799 1$ ls *.c | map f 'foo $f'
1800 1$ ls *.c | parallel foo
1802 2$ ls *.c | map f 'foo $f; bar $f'
1803 2$ ls *.c | parallel 'foo {}; bar {}'
1805 3$ cat urls | map u 'curl -O $u'
1806 3$ cat urls | parallel curl -O
1808 4$ printf "1\n1\n1\n" | map t 'sleep $t && say done'
1809 4$ printf "1\n1\n1\n" | parallel 'sleep {} && say done'
1810 4$ parallel 'sleep {} && say done' ::: 1 1 1
1812 5$ printf "1\n1\n1\n" | map t 'sleep $t && say done &'
1813 5$ printf "1\n1\n1\n" | parallel -j0 'sleep {} && say done'
1814 5$ parallel -j0 'sleep {} && say done' ::: 1 1 1
1816 https://github.com/soveran/map
1817 (Last checked: 2019-01)
1820 =head2 DIFFERENCES BETWEEN loop AND GNU Parallel
1822 B<loop> mixes stdout and stderr:
1824 loop 'ls /no-such-file' >/dev/null
1826 B<loop>'s replacement string B<$ITEM> does not quote strings:
1828 echo 'two spaces' | loop 'echo $ITEM'
1830 B<loop> cannot run functions:
1832 myfunc() { echo joe; }
1833 export -f myfunc
1834 loop 'myfunc this fails'
1836 =head3 EXAMPLES FROM loop's WEBSITE
1838 Some of the examples from https://github.com/Miserlou/Loop/ can be
1839 emulated with GNU B<parallel>:
1841 # A couple of functions will make the code easier to read
  $ loopy() {
      yes | parallel -uN0 -j1 "$@"
    }
  $ export -f loopy
  $ time_out() {
      parallel -uN0 -q --timeout "$@" ::: 1
    }
  $ match() {
      perl -0777 -ne 'grep /'"$1"'/,$_ and print or exit 1'
    }
  $ export -f match
1854 $ loop 'ls' --every 10s
1855 $ loopy --delay 10s ls
1857 $ loop 'touch $COUNT.txt' --count-by 5
1858 $ loopy touch '{= $_=seq()*5 =}'.txt
1860 $ loop --until-contains 200 -- \
      ./get_response_code.sh --site mysite.biz
1862 $ loopy --halt now,success=1 \
1863 './get_response_code.sh --site mysite.biz | match 200'
1865 $ loop './poke_server' --for-duration 8h
1866 $ time_out 8h loopy ./poke_server
1868 $ loop './poke_server' --until-success
1869 $ loopy --halt now,success=1 ./poke_server
1871 $ cat files_to_create.txt | loop 'touch $ITEM'
1872 $ cat files_to_create.txt | parallel touch {}
1874 $ loop 'ls' --for-duration 10min --summary
1875 # --joblog is somewhat more verbose than --summary
  $ time_out 10m loopy --joblog my.log ls; cat my.log
1878 $ loop 'echo hello'
1879 $ loopy echo hello
1881 $ loop 'echo $COUNT'
1882 # GNU Parallel counts from 1
1883 $ loopy echo {#}
1884 # Counting from 0 can be forced
1885 $ loopy echo '{= $_=seq()-1 =}'
1887 $ loop 'echo $COUNT' --count-by 2
1888 $ loopy echo '{= $_=2*(seq()-1) =}'
1890 $ loop 'echo $COUNT' --count-by 2 --offset 10
1891 $ loopy echo '{= $_=10+2*(seq()-1) =}'
1893 $ loop 'echo $COUNT' --count-by 1.1
1894 # GNU Parallel rounds 3.3000000000000003 to 3.3
1895 $ loopy echo '{= $_=1.1*(seq()-1) =}'
1897 $ loop 'echo $COUNT $ACTUALCOUNT' --count-by 2
1898 $ loopy echo '{= $_=2*(seq()-1) =} {#}'
1900 $ loop 'echo $COUNT' --num 3 --summary
1901 # --joblog is somewhat more verbose than --summary
1902 $ seq 3 | parallel --joblog my.log echo; cat my.log
1904 $ loop 'ls -foobarbatz' --num 3 --summary
1905 # --joblog is somewhat more verbose than --summary
1906 $ seq 3 | parallel --joblog my.log -N0 ls -foobarbatz; cat my.log
1908 $ loop 'echo $COUNT' --count-by 2 --num 50 --only-last
1909 # Can be emulated by running 2 jobs
1910 $ seq 49 | parallel echo '{= $_=2*(seq()-1) =}' >/dev/null
1911 $ echo 50| parallel echo '{= $_=2*(seq()-1) =}'
1913 $ loop 'date' --every 5s
1914 $ loopy --delay 5s date
1916 $ loop 'date' --for-duration 8s --every 2s
1917 $ time_out 8s loopy --delay 2s date
1919 $ loop 'date -u' --until-time '2018-05-25 20:50:00' --every 5s
  $ seconds=$((`date -d 2018-05-25T20:50:00 +%s` - `date +%s`))s
1921 $ time_out $seconds loopy --delay 5s date -u
1923 $ loop 'echo $RANDOM' --until-contains "666"
1924 $ loopy --halt now,success=1 'echo $RANDOM | match 666'
1926 $ loop 'if (( RANDOM % 2 )); then
1927 (echo "TRUE"; true);
1928 else
1929 (echo "FALSE"; false);
1930 fi' --until-success
  $ loopy --halt now,success=1 'if (( $RANDOM % 2 )); then
        (echo "TRUE"; true);
      else
        (echo "FALSE"; false);
      fi'
1937 $ loop 'if (( RANDOM % 2 )); then
1938 (echo "TRUE"; true);
1939 else
1940 (echo "FALSE"; false);
1941 fi' --until-error
  $ loopy --halt now,fail=1 'if (( $RANDOM % 2 )); then
        (echo "TRUE"; true);
      else
        (echo "FALSE"; false);
      fi'
1948 $ loop 'date' --until-match "(\d{4})"
1949 $ loopy --halt now,success=1 'date | match [0-9][0-9][0-9][0-9]'
1951 $ loop 'echo $ITEM' --for red,green,blue
1952 $ parallel echo ::: red green blue
1954 $ cat /tmp/my-list-of-files-to-create.txt | loop 'touch $ITEM'
1955 $ cat /tmp/my-list-of-files-to-create.txt | parallel touch
1957 $ ls | loop 'cp $ITEM $ITEM.bak'; ls
1958 $ ls | parallel cp {} {}.bak; ls
1960 $ loop 'echo $ITEM | tr a-z A-Z' -i
1961 $ parallel 'echo {} | tr a-z A-Z'
1962 # Or more efficiently:
1963 $ parallel --pipe tr a-z A-Z
1965 $ loop 'echo $ITEM' --for "`ls`"
1966 $ parallel echo {} ::: "`ls`"
1968 $ ls | loop './my_program $ITEM' --until-success;
1969 $ ls | parallel --halt now,success=1 ./my_program {}
1971 $ ls | loop './my_program $ITEM' --until-fail;
1972 $ ls | parallel --halt now,fail=1 ./my_program {}
1974 $ ./deploy.sh;
1975 loop 'curl -sw "%{http_code}" http://coolwebsite.biz' \
1976 --every 5s --until-contains 200;
1977 ./announce_to_slack.sh
1978 $ ./deploy.sh;
1979 loopy --delay 5s --halt now,success=1 \
1980 'curl -sw "%{http_code}" http://coolwebsite.biz | match 200';
1981 ./announce_to_slack.sh
1983 $ loop "ping -c 1 mysite.com" --until-success; ./do_next_thing
1984 $ loopy --halt now,success=1 ping -c 1 mysite.com; ./do_next_thing
1986 $ ./create_big_file -o my_big_file.bin;
1987 loop 'ls' --until-contains 'my_big_file.bin';
1988 ./upload_big_file my_big_file.bin
1989 # inotifywait is a better tool to detect file system changes.
1990 # It can even make sure the file is complete
1991 # so you are not uploading an incomplete file
1992 $ inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f . |
1993 grep my_big_file.bin
1995 $ ls | loop 'cp $ITEM $ITEM.bak'
1996 $ ls | parallel cp {} {}.bak
1998 $ loop './do_thing.sh' --every 15s --until-success --num 5
1999 $ parallel --retries 5 --delay 15s ::: ./do_thing.sh
2001 https://github.com/Miserlou/Loop/
2002 (Last checked: 2018-10)
2005 =head2 DIFFERENCES BETWEEN lorikeet AND GNU Parallel
2007 B<lorikeet> can run jobs in parallel. It does this based on a
2008 dependency graph described in a file, so this is similar to B<make>.
2010 https://github.com/cetra3/lorikeet
2011 (Last checked: 2018-10)
2014 =head2 DIFFERENCES BETWEEN spp AND GNU Parallel
2016 B<spp> can run jobs in parallel. B<spp> does not use a command
2017 template to generate the jobs, but requires jobs to be in a
file. Output from the jobs mixes.
2020 https://github.com/john01dav/spp
2021 (Last checked: 2019-01)
2024 =head2 DIFFERENCES BETWEEN paral AND GNU Parallel
B<paral> prints a lot of status information and stores the output from
the commands run into files. This means it cannot be used in the
middle of a pipe like this:
2030 paral "echo this" "echo does not" "echo work" | wc
2032 Instead it puts the output into files named like
2033 B<out_#_I<command>.out.log>. To get a very similar behaviour with GNU
2034 B<parallel> use B<--results
2035 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta>
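The Perl expression in that B<--results> template removes every
character that is not whitespace, a-z, 0-9, or _ and then turns each
run of whitespace into a single _. A minimal sketch of the resulting
file names, with B<sed -E> standing in for the Perl substitutions; the
function name B<paral_like_name> is made up for illustration:

```shell
# Hypothetical helper: paral_like_name JOBNUMBER COMMAND
# Prints the file name that the --results template above would
# produce for that job number and command.
paral_like_name() {
  cleaned=$(printf '%s\n' "$2" |
    sed -E 's/[^[:space:]a-z_0-9]//g; s/[[:space:]]+/_/g')
  printf 'out_%s_%s.log\n' "$1" "$cleaned"
}

paral_like_name 2 'command 2 --flag'     # out_2_command_2_flag.log
paral_like_name 1 'sleep 1 && echo c1'   # out_1_sleep_1_echo_c1.log
```

Different commands can map to the same cleaned name, but the job
number B<{#}> keeps the file names unique.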
2037 B<paral> only takes arguments on the command line and each argument
2038 should be a full command. Thus it does not use command templates.
2040 This limits how many jobs it can run in total, because they all need
2041 to fit on a single command line.
2043 B<paral> has no support for running jobs remotely.
2045 =head3 EXAMPLES FROM README.markdown
2047 The examples from B<README.markdown> and the corresponding command run
2048 with GNU B<parallel> (B<--results
2049 'out_{#}_{=s/[^\sa-z_0-9]//g;s/\s+/_/g=}.log' --eta> is omitted from
2050 the GNU B<parallel> command):
2052 1$ paral "command 1" "command 2 --flag" "command arg1 arg2"
2053 1$ parallel ::: "command 1" "command 2 --flag" "command arg1 arg2"
2055 2$ paral "sleep 1 && echo c1" "sleep 2 && echo c2" \
2056 "sleep 3 && echo c3" "sleep 4 && echo c4" "sleep 5 && echo c5"
2057 2$ parallel ::: "sleep 1 && echo c1" "sleep 2 && echo c2" \
2058 "sleep 3 && echo c3" "sleep 4 && echo c4" "sleep 5 && echo c5"
2059 # Or shorter:
2060 parallel "sleep {} && echo c{}" ::: {1..5}
2062 3$ paral -n=0 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2063 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2064 3$ parallel ::: "sleep 5 && echo c5" "sleep 4 && echo c4" \
2065 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2066 # Or shorter:
2067 parallel -j0 "sleep {} && echo c{}" ::: 5 4 3 2 1
2069 4$ paral -n=1 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2070 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2071 4$ parallel -j1 "sleep {} && echo c{}" ::: 5 4 3 2 1
2073 5$ paral -n=2 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2074 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2075 5$ parallel -j2 "sleep {} && echo c{}" ::: 5 4 3 2 1
2077 6$ paral -n=5 "sleep 5 && echo c5" "sleep 4 && echo c4" \
2078 "sleep 3 && echo c3" "sleep 2 && echo c2" "sleep 1 && echo c1"
2079 6$ parallel -j5 "sleep {} && echo c{}" ::: 5 4 3 2 1
2081 7$ paral -n=1 "echo a && sleep 0.5 && echo b && sleep 0.5 && \
2082 echo c && sleep 0.5 && echo d && sleep 0.5 && \
2083 echo e && sleep 0.5 && echo f && sleep 0.5 && \
2084 echo g && sleep 0.5 && echo h"
2085 7$ parallel ::: "echo a && sleep 0.5 && echo b && sleep 0.5 && \
2086 echo c && sleep 0.5 && echo d && sleep 0.5 && \
2087 echo e && sleep 0.5 && echo f && sleep 0.5 && \
2088 echo g && sleep 0.5 && echo h"
2090 https://github.com/amattn/paral
2091 (Last checked: 2019-01)
2094 =head2 DIFFERENCES BETWEEN concurr AND GNU Parallel
2096 B<concurr> is built to run jobs in parallel using a client/server
2097 model.
2099 =head3 EXAMPLES FROM README.md
2101 The examples from B<README.md>:
2103 1$ concurr 'echo job {#} on slot {%}: {}' : arg1 arg2 arg3 arg4
2104 1$ parallel 'echo job {#} on slot {%}: {}' ::: arg1 arg2 arg3 arg4
2106 2$ concurr 'echo job {#} on slot {%}: {}' :: file1 file2 file3
2107 2$ parallel 'echo job {#} on slot {%}: {}' :::: file1 file2 file3
2109 3$ concurr 'echo {}' < input_file
2110 3$ parallel 'echo {}' < input_file
2112 4$ cat file | concurr 'echo {}'
2113 4$ cat file | parallel 'echo {}'
B<concurr> deals badly with empty input files and with output larger
than 64 KB.
2118 https://github.com/mmstick/concurr
2119 (Last checked: 2019-01)
2122 =head2 DIFFERENCES BETWEEN lesser-parallel AND GNU Parallel
2124 B<lesser-parallel> is the inspiration for B<parallel --embed>. Both
2125 B<lesser-parallel> and B<parallel --embed> define bash functions that
2126 can be included as part of a bash script to run jobs in parallel.
2128 B<lesser-parallel> implements a few of the replacement strings, but
2129 hardly any options, whereas B<parallel --embed> gives you the full
2130 GNU B<parallel> experience.
2132 https://github.com/kou1okada/lesser-parallel
2133 (Last checked: 2019-01)
2136 =head2 DIFFERENCES BETWEEN npm-parallel AND GNU Parallel
2138 B<npm-parallel> can run npm tasks in parallel.
2140 There are no examples and very little documentation, so it is hard to
2141 compare to GNU B<parallel>.
2143 https://github.com/spion/npm-parallel
2144 (Last checked: 2019-01)
2147 =head2 DIFFERENCES BETWEEN machma AND GNU Parallel
2149 B<machma> runs tasks in parallel. It gives time stamped
2150 output. It buffers in RAM.
2152 =head3 EXAMPLES FROM README.md
2154 The examples from README.md:
2156 1$ # Put shorthand for timestamp in config for the examples
2157 echo '--rpl '\
2158 \''{time} $_=::strftime("%Y-%m-%d %H:%M:%S",localtime())'\' \
2159 > ~/.parallel/machma
2160 echo '--line-buffer --tagstring "{#} {time} {}"' \
2161 >> ~/.parallel/machma
2163 2$ find . -iname '*.jpg' |
2164 machma -- mogrify -resize 1200x1200 -filter Lanczos {}
2165 find . -iname '*.jpg' |
2166 parallel --bar -Jmachma mogrify -resize 1200x1200 \
2167 -filter Lanczos {}
2169 3$ cat /tmp/ips | machma -p 2 -- ping -c 2 -q {}
2170 3$ cat /tmp/ips | parallel -j2 -Jmachma ping -c 2 -q {}
2172 4$ cat /tmp/ips |
2173 machma -- sh -c 'ping -c 2 -q $0 > /dev/null && echo alive' {}
2174 4$ cat /tmp/ips |
2175 parallel -Jmachma 'ping -c 2 -q {} > /dev/null && echo alive'
2177 5$ find . -iname '*.jpg' |
2178 machma --timeout 5s -- mogrify -resize 1200x1200 \
2179 -filter Lanczos {}
2180 5$ find . -iname '*.jpg' |
2181 parallel --timeout 5s --bar mogrify -resize 1200x1200 \
2182 -filter Lanczos {}
2184 6$ find . -iname '*.jpg' -print0 |
2185 machma --null -- mogrify -resize 1200x1200 -filter Lanczos {}
2186 6$ find . -iname '*.jpg' -print0 |
2187 parallel --null --bar mogrify -resize 1200x1200 \
2188 -filter Lanczos {}
2190 https://github.com/fd0/machma
2191 (Last checked: 2019-06)
2194 =head2 DIFFERENCES BETWEEN interlace AND GNU Parallel
2196 Summary (see legend above):
2198 =over
2200 =item - I2 I3 I4 - - -
2202 =item M1 - M3 - - M6
2204 =item - O2 O3 - - - - x x
2206 =item E1 E2 - - - - -
2208 =item - - - - - - - - -
2210 =item - -
2212 =back
2214 B<interlace> is built for network analysis to run network tools in parallel.
2216 B<interlace> does not buffer output, so output from different jobs mixes.
2218 The overhead for each target is O(n*n), so with 1000 targets it
2219 becomes very slow with an overhead in the order of 500ms/target.
2221 =head3 EXAMPLES FROM interlace's WEBSITE
2223 Using B<prips> most of the examples from
2224 https://github.com/codingo/Interlace can be run with GNU B<parallel>:
2226 Blocker
2228 commands.txt:
2229 mkdir -p _output_/_target_/scans/
2230 _blocker_
2231 nmap _target_ -oA _output_/_target_/scans/_target_-nmap
2232 interlace -tL ./targets.txt -cL commands.txt -o $output
2234 parallel -a targets.txt \
2235 mkdir -p $output/{}/scans/\; nmap {} -oA $output/{}/scans/{}-nmap
2237 Blocks
2239 commands.txt:
2240 _block:nmap_
2241 mkdir -p _target_/output/scans/
2242 nmap _target_ -oN _target_/output/scans/_target_-nmap
2243 _block:nmap_
2244 nikto --host _target_
2245 interlace -tL ./targets.txt -cL commands.txt
2247 _nmap() {
2248 mkdir -p $1/output/scans/
2249 nmap $1 -oN $1/output/scans/$1-nmap
2250 }
2251 export -f _nmap
2252 parallel ::: _nmap "nikto --host" :::: targets.txt
2254 Run Nikto Over Multiple Sites
2256 interlace -tL ./targets.txt -threads 5 \
2257 -c "nikto --host _target_ > ./_target_-nikto.txt" -v
2259 parallel -a targets.txt -P5 nikto --host {} \> ./{}-nikto.txt
2261 Run Nikto Over Multiple Sites and Ports
2263 interlace -tL ./targets.txt -threads 5 -c \
2264 "nikto --host _target_:_port_ > ./_target_-_port_-nikto.txt" \
2265 -p 80,443 -v
2267 parallel -P5 nikto --host {1}:{2} \> ./{1}-{2}-nikto.txt \
2268 :::: targets.txt ::: 80 443
2270 Run a List of Commands against Target Hosts
2272 commands.txt:
2273 nikto --host _target_:_port_ > _output_/_target_-nikto.txt
2274 sslscan _target_:_port_ > _output_/_target_-sslscan.txt
2275 testssl.sh _target_:_port_ > _output_/_target_-testssl.txt
2276 interlace -t example.com -o ~/Engagements/example/ \
2277 -cL ./commands.txt -p 80,443
2279 parallel --results ~/Engagements/example/{2}:{3}{1} {1} {2}:{3} \
2280 ::: "nikto --host" sslscan testssl.sh ::: example.com ::: 80 443
2282 CIDR notation with an application that doesn't support it
2284 interlace -t 192.168.12.0/24 -c "vhostscan _target_ \
2285 -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
2287 prips 192.168.12.0/24 |
2288 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2290 Glob notation with an application that doesn't support it
2292 interlace -t 192.168.12.* -c "vhostscan _target_ \
2293 -oN _output_/_target_-vhosts.txt" -o ~/scans/ -threads 50
2295 # Glob is not supported in prips
2296 prips 192.168.12.0/24 |
2297 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2299 Dash (-) notation with an application that doesn't support it
2301 interlace -t 192.168.12.1-15 -c \
2302 "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
2303 -o ~/scans/ -threads 50
2305 # Dash notation is not supported in prips
2306 prips 192.168.12.1 192.168.12.15 |
2307 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2309 Threading Support for an application that doesn't support it
2311 interlace -tL ./target-list.txt -c \
2312 "vhostscan -t _target_ -oN _output_/_target_-vhosts.txt" \
2313 -o ~/scans/ -threads 50
2315 cat ./target-list.txt |
2316 parallel -P50 vhostscan -t {} -oN ~/scans/{}-vhosts.txt
2318 alternatively
2320 ./vhosts-commands.txt:
2321 vhostscan -t $target -oN _output_/_target_-vhosts.txt
2322 interlace -cL ./vhosts-commands.txt -tL ./target-list.txt \
2323 -threads 50 -o ~/scans
2325 ./vhosts-commands.txt:
2326 vhostscan -t "$1" -oN "$2"
2327 parallel -P50 ./vhosts-commands.txt {} ~/scans/{}-vhosts.txt \
2328 :::: ./target-list.txt
2330 Exclusions
2332 interlace -t 192.168.12.0/24 -e 192.168.12.0/26 -c \
2333 "vhostscan _target_ -oN _output_/_target_-vhosts.txt" \
2334 -o ~/scans/ -threads 50
2336 prips 192.168.12.0/24 | grep -xv -Ff <(prips 192.168.12.0/26) |
2337 parallel -P50 vhostscan {} -oN ~/scans/{}-vhosts.txt
2339 Run Nikto Using Multiple Proxies
2341 interlace -tL ./targets.txt -pL ./proxies.txt -threads 5 -c \
2342 "nikto --host _target_:_port_ -useproxy _proxy_ > \
2343 ./_target_-_port_-nikto.txt" -p 80,443 -v
2345 parallel -j5 \
2346 "nikto --host {1}:{2} -useproxy {3} > ./{1}-{2}-nikto.txt" \
2347 :::: ./targets.txt ::: 80 443 :::: ./proxies.txt
2349 https://github.com/codingo/Interlace
2350 (Last checked: 2019-09)
2353 =head2 DIFFERENCES BETWEEN otonvm Parallel AND GNU Parallel
2355 I have been unable to get the code to run at all. It seems unfinished.
2357 https://github.com/otonvm/Parallel
2358 (Last checked: 2019-02)
2361 =head2 DIFFERENCES BETWEEN k-bx par AND GNU Parallel
2363 B<par> requires Haskell to work. This limits the number of platforms
2364 this can work on.
2366 B<par> does line buffering in memory. The memory usage is 3x the
2367 longest line (compared to 1x for B<parallel --lb>). Commands must be
2368 given as arguments. There is no template.
2370 These are the examples from https://github.com/k-bx/par with the
2371 corresponding GNU B<parallel> command.
2373 par "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2374 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2375 parallel --lb ::: "echo foo; sleep 1; echo foo; sleep 1; echo foo" \
2376 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2378 par "echo foo; sleep 1; foofoo" \
2379 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2380 parallel --lb --halt 1 ::: "echo foo; sleep 1; foofoo" \
2381 "echo bar; sleep 1; echo bar; sleep 1; echo bar" && echo "success"
2383 par "PARPREFIX=[fooechoer] echo foo" "PARPREFIX=[bar] echo bar"
2384 parallel --lb --colsep , --tagstring {1} {2} \
2385 ::: "[fooechoer],echo foo" "[bar],echo bar"
2387 par --succeed "foo" "bar" && echo 'wow'
2388 parallel ::: "foo" "bar"; true && echo 'wow'
2390 https://github.com/k-bx/par
2391 (Last checked: 2019-02)
2393 =head2 DIFFERENCES BETWEEN parallelshell AND GNU Parallel
2395 B<parallelshell> does not allow for composed commands:
2397 # This does not work
2398 parallelshell 'echo foo;echo bar' 'echo baz;echo quuz'
2400 Instead you have to wrap that in a shell:
2402 parallelshell 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2404 It buffers output in RAM. All commands must be given on the command
2405 line and all commands are started in parallel at the same time. This
2406 will cause the system to freeze if there are so many jobs that there
2407 is not enough memory to run them all at the same time.
2409 https://github.com/keithamus/parallelshell
2410 (Last checked: 2019-02)
2412 https://github.com/darkguy2008/parallelshell
2413 (Last checked: 2019-03)
2416 =head2 DIFFERENCES BETWEEN shell-executor AND GNU Parallel
2418 B<shell-executor> does not allow for composed commands:
2420 # This does not work
2421 sx 'echo foo;echo bar' 'echo baz;echo quuz'
2423 Instead you have to wrap that in a shell:
2425 sx 'sh -c "echo foo;echo bar"' 'sh -c "echo baz;echo quuz"'
2427 It buffers output in RAM. All commands must be given on the command
2428 line and all commands are started in parallel at the same time. This
2429 will cause the system to freeze if there are so many jobs that there
2430 is not enough memory to run them all at the same time.
2432 https://github.com/royriojas/shell-executor
2433 (Last checked: 2019-02)
2436 =head2 DIFFERENCES BETWEEN non-GNU par AND GNU Parallel
2438 B<par> buffers in memory to avoid mixing of jobs. It takes 1s per 1
2439 million output lines.
2441 B<par> needs to have all commands before starting the first job. The
2442 jobs are read from stdin (standard input) so any quoting will have to
2443 be done by the user.
2445 Stdout (standard output) is prepended with o:. Stderr (standard error)
2446 is sent to stdout (standard output) and prepended with e:.
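The o:/e: tagging described above can be imitated in plain shell. This is a minimal sketch and not how B<par> is implemented: B<tag_streams> is a hypothetical helper name, both streams end up on stdout, and the relative order of stdout and stderr lines is not guaranteed.

```shell
# tag_streams is a hypothetical helper, not part of par or GNU parallel.
# It runs a command, prefixes its stdout lines with "o:" and its stderr
# lines with "e:", and writes both to stdout.
tag_streams() {
  # fd 3 is a copy of the original stdout. The inner group sends the
  # command's stdout through the "o:" sed and out via fd 3, while the
  # group's stderr is piped to the "e:" sed.
  { { "$@" | sed 's/^/o:/'; } 2>&1 1>&3 | sed 's/^/e:/'; } 3>&1
}

tag_streams sh -c 'echo to-stdout; echo to-stderr >&2'
```

The exit value of the command is lost in this sketch, so it is only meant to show the output format.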
2448 For short jobs with little output B<par> is 20% faster than GNU
2449 B<parallel> and 60% slower than B<xargs>.
2451 https://github.com/UnixJunkie/PAR
2453 https://savannah.nongnu.org/projects/par
2454 (Last checked: 2019-02)
2457 =head2 DIFFERENCES BETWEEN fd AND GNU Parallel
2459 B<fd> does not support composed commands, so commands must be wrapped
2460 in B<sh -c>.
2462 It buffers output in RAM.
2464 It only takes file names from the filesystem as input (similar to B<find>).
2466 https://github.com/sharkdp/fd
2467 (Last checked: 2019-02)
2470 =head2 DIFFERENCES BETWEEN lateral AND GNU Parallel
2472 B<lateral> is very similar to B<sem>: It takes a single command and
2473 runs it in the background. The design means that output from parallel
2474 running jobs may mix. If it dies unexpectedly it leaves a socket in
2475 ~/.lateral/socket.PID.
2477 B<lateral> deals badly with too long command lines. This makes the
2478 B<lateral> server crash:
2480 lateral run echo `seq 100000| head -c 1000k`
2482 Any options will be read by B<lateral> so this does not work
2483 (B<lateral> interprets the B<-l>):
2485 lateral run ls -l
2487 Composed commands do not work:
2489 lateral run pwd ';' ls
2491 Functions do not work:
2493 myfunc() { echo a; }
2494 export -f myfunc
2495 lateral run myfunc
2497 Running B<emacs> in the terminal causes the parent shell to die:
2499 echo '#!/bin/bash' > mycmd
2500 echo emacs -nw >> mycmd
2501 chmod +x mycmd
2502 lateral start
2503 lateral run ./mycmd
2505 Here are the examples from https://github.com/akramer/lateral with the
2506 corresponding GNU B<sem> and GNU B<parallel> commands:
2508 1$ lateral start
2509 for i in $(cat /tmp/names); do
2510 lateral run -- some_command $i
2511 done
2512 lateral wait
2514 1$ for i in $(cat /tmp/names); do
2515 sem some_command $i
2516 done
2517 sem --wait
2519 1$ parallel some_command :::: /tmp/names
2521 2$ lateral start
2522 for i in $(seq 1 100); do
2523 lateral run -- my_slow_command < workfile$i > /tmp/logfile$i
2524 done
2525 lateral wait
2527 2$ for i in $(seq 1 100); do
2528 sem my_slow_command < workfile$i > /tmp/logfile$i
2529 done
2530 sem --wait
2532 2$ parallel 'my_slow_command < workfile{} > /tmp/logfile{}' \
2533 ::: {1..100}
2535 3$ lateral start -p 0 # yup, it will just queue tasks
2536 for i in $(seq 1 100); do
2537 lateral run -- command_still_outputs_but_wont_spam inputfile$i
2538 done
2539 # command output spam can commence
2540 lateral config -p 10; lateral wait
2542 3$ for i in $(seq 1 100); do
2543 echo "command inputfile$i" >> joblist
2544 done
2545 parallel -j 10 :::: joblist
2547 3$ echo 1 > /tmp/njobs
2548 parallel -j /tmp/njobs command inputfile{} \
2549 ::: {1..100} &
2550 echo 10 >/tmp/njobs
2551 wait
2553 https://github.com/akramer/lateral
2554 (Last checked: 2019-03)
2557 =head2 DIFFERENCES BETWEEN with-this AND GNU Parallel
2559 The examples from https://github.com/amritb/with-this.git and the
2560 corresponding GNU B<parallel> command:
2562 with -v "$(cat myurls.txt)" "curl -L this"
2563 parallel curl -L :::: myurls.txt
2565 with -v "$(cat myregions.txt)" \
2566 "aws --region=this ec2 describe-instance-status"
2567 parallel aws --region={} ec2 describe-instance-status \
2568 :::: myregions.txt
2570 with -v "$(ls)" "kubectl --kubeconfig=this get pods"
2571 ls | parallel kubectl --kubeconfig={} get pods
2573 with -v "$(ls | grep config)" "kubectl --kubeconfig=this get pods"
2574 ls | grep config | parallel kubectl --kubeconfig={} get pods
2576 with -v "$(echo {1..10})" "echo 123"
2577 parallel -N0 echo 123 ::: {1..10}
2579 Stderr is merged with stdout. B<with-this> buffers in RAM. It uses 3x
2580 the output size, so you cannot have output larger than 1/3rd the
2581 amount of RAM. The input values cannot contain spaces. Composed
2582 commands do not work.
2584 B<with-this> gives some additional information, so the output has to
2585 be cleaned before piping it to the next command.
2587 https://github.com/amritb/with-this.git
2588 (Last checked: 2019-03)
2591 =head2 DIFFERENCES BETWEEN Tollef's parallel (moreutils) AND GNU Parallel
2593 Summary (see legend above):
2595 =over
2597 =item - - - I4 - - I7
2599 =item - - M3 - - M6
2601 =item - O2 O3 - O5 O6 - x x
2603 =item E1 - - - - - E7
2605 =item - x x x x x x x x
2607 =item - -
2609 =back
2611 =head3 EXAMPLES FROM Tollef's parallel MANUAL
2613 B<Tollef> parallel sh -c "echo hi; sleep 2; echo bye" -- 1 2 3
2615 B<GNU> parallel "echo hi; sleep 2; echo bye" ::: 1 2 3
2617 B<Tollef> parallel -j 3 ufraw -o processed -- *.NEF
2619 B<GNU> parallel -j 3 ufraw -o processed ::: *.NEF
2621 B<Tollef> parallel -j 3 -- ls df "echo hi"
2623 B<GNU> parallel -j 3 ::: ls df "echo hi"
2625 (Last checked: 2019-08)
2627 =head2 DIFFERENCES BETWEEN rargs AND GNU Parallel
2629 Summary (see legend above):
2631 =over
2633 =item I1 - - - - - I7
2635 =item - - M3 M4 - -
2637 =item - O2 O3 - O5 O6 - O8 -
2639 =item E1 - - E4 - - -
2641 =item - - - - - - - - -
2643 =item - -
2645 =back
2647 B<rargs> has elegant ways of doing named regexp capture and field ranges.
2649 With GNU B<parallel> you can use B<--rpl> to get functionality
2650 similar to regexp capture, and use B<join> and B<@arg> to
2651 get the field ranges. But the syntax is longer. This:
2653 --rpl '{r(\d+)\.\.(\d+)} $_=join"$opt::colsep",@arg[$$1..$$2]'
2655 would make it possible to use:
2657 {1r3..6}
2659 for field 3..6.
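As a point of reference, the value expected from {1r3..6} with B<--colsep :> should be the same as selecting fields 3 to 6 with B<cut>. The sketch below assumes colon separated input; the input line is made-up sample data and B<fields_3_to_6> is a hypothetical helper name.

```shell
# fields_3_to_6 is a hypothetical helper for illustration.
# It prints fields 3..6 of each colon-separated input line,
# joined by the same separator.
fields_3_to_6() {
  cut -d: -f3-6
}

# Sample data, not a real account
echo 'user:x:1000:1000:Example User:/home/user:/bin/sh' | fields_3_to_6
```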
2661 For full support of {n..m:s} including negative numbers use a dynamic
2662 replacement string like this:
2665 PARALLEL=--rpl\ \''{r((-?\d+)?)\.\.((-?\d+)?)((:([^}]*))?)}
2666 $a = defined $$2 ? $$2 < 0 ? 1+$#arg+$$2 : $$2 : 1;
2667 $b = defined $$4 ? $$4 < 0 ? 1+$#arg+$$4 : $$4 : $#arg+1;
2668 $s = defined $$6 ? $$7 : " ";
2669 $_ = join $s,@arg[$a..$b]'\'
2670 export PARALLEL
2672 You can then do:
2674 head /etc/passwd | parallel --colsep : echo ..={1r..} ..3={1r..3} \
2675 4..={1r4..} 2..4={1r2..4} 3..3={1r3..3} ..3:-={1r..3:-} \
2676 ..3:/={1r..3:/} -1={-1} -5={-5} -6={-6} -3..={1r-3..}
2678 =head3 EXAMPLES FROM rargs MANUAL
2680 1$ ls *.bak | rargs -p '(.*)\.bak' mv {0} {1}
2682 1$ ls *.bak | parallel mv {} {.}
2684 2$ cat download-list.csv |
2685 rargs -p '(?P<url>.*),(?P<filename>.*)' wget {url} -O {filename}
2687 2$ cat download-list.csv |
2688 parallel --csv wget {1} -O {2}
2689 # or use regexps:
2690 2$ cat download-list.csv |
2691 parallel --rpl '{url} s/,.*//' --rpl '{filename} s/.*?,//' \
2692 wget {url} -O {filename}
2694 3$ cat /etc/passwd |
2695 rargs -d: echo -e 'id: "{1}"\t name: "{5}"\t rest: "{6..::}"'
2697 3$ cat /etc/passwd |
2698 parallel -q --colsep : \
2699 echo -e 'id: "{1}"\t name: "{5}"\t rest: "{=6 $_=join":",@arg[6..$#arg]=}"'
2701 https://github.com/lotabout/rargs
2702 (Last checked: 2020-01)
2705 =head2 DIFFERENCES BETWEEN threader AND GNU Parallel
2707 Summary (see legend above):
2709 =over
2711 =item I1 - - - - - -
2713 =item M1 - M3 - - M6
2715 =item O1 - O3 - O5 - - x x
2717 =item E1 - - E4 - - -
2719 =item - - - - - - - - -
2721 =item - -
2723 =back
2725 Newline separates arguments, but newline at the end of file is treated
2726 as an empty argument. So this runs 2 jobs:
2728 echo two_jobs | threader -run 'echo "$THREADID"'
2730 B<threader> ignores stderr, so any output to stderr is
2731 lost. B<threader> buffers in RAM, so output bigger than the machine's
2732 virtual memory will cause the machine to crash.
2734 https://github.com/voodooEntity/threader
2735 (Last checked: 2020-04)
2738 =head2 DIFFERENCES BETWEEN runp AND GNU Parallel
2740 Summary (see legend above):
2742 =over
2744 =item I1 I2 - - - - -
2746 =item M1 - (M3) - - M6
2748 =item O1 O2 O3 - O5 O6 - x x -
2750 =item E1 - - - - - -
2752 =item - - - - - - - - -
2754 =item - -
2756 =back
2758 (M3): You can only add a prefix and a postfix to the input, so the
2759 argument can be inserted in the command line only once.
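The limitation in (M3) can be illustrated with a small sketch. B<build_commands> is a hypothetical helper, not part of B<runp>; it assumes the prefix, the argument, and the postfix are joined by single spaces, and it prints the command lines instead of running them.

```shell
# build_commands is a hypothetical helper, not part of runp.
# It reads arguments from stdin and prints "prefix argument postfix"
# for each of them, so the argument appears exactly once per line.
build_commands() {
  prefix=$1
  postfix=$2
  while IFS= read -r arg; do
    printf '%s %s %s\n' "$prefix" "$arg" "$postfix"
  done
}

printf '%s\n' localhost 1.1.1.1 |
  build_commands 'ping -c 5 -W 2' '| grep loss'
```

A template that uses the argument twice, such as B<echo {} {}>, cannot be expressed with a prefix and a postfix alone.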
2761 B<runp> runs 10 jobs in parallel by default. B<runp> blocks if output
2762 of a command is > 64 Kbytes. Quoting of input is needed. It adds
2763 output to stderr (this can be prevented with B<-q>).
2765 =head3 Examples as GNU Parallel
2767 base='https://images-api.nasa.gov/search'
2768 query='jupiter'
2769 desc='planet'
2770 type='image'
2771 url="$base?q=$query&description=$desc&media_type=$type"
2773 # Download the images in parallel using runp
2774 curl -s $url | jq -r .collection.items[].href | \
2775 runp -p 'curl -s' | jq -r .[] | grep large | \
2776 runp -p 'curl -s -L -O'
2778 time curl -s $url | jq -r .collection.items[].href | \
2779 runp -g 1 -q -p 'curl -s' | jq -r .[] | grep large | \
2780 runp -g 1 -q -p 'curl -s -L -O'
2782 # Download the images in parallel
2783 curl -s $url | jq -r .collection.items[].href | \
2784 parallel curl -s | jq -r .[] | grep large | \
2785 parallel curl -s -L -O
2787 time curl -s $url | jq -r .collection.items[].href | \
2788 parallel -j 1 curl -s | jq -r .[] | grep large | \
2789 parallel -j 1 curl -s -L -O
2792 =head4 Run some test commands (read from file)
2794 # Create a file containing commands to run in parallel.
2795 cat << EOF > /tmp/test-commands.txt
2796 sleep 5
2797 sleep 3
2798 blah # this will fail
2799 ls $PWD # PWD shell variable is used here
2800 EOF
2802 # Run commands from the file.
2803 runp /tmp/test-commands.txt > /dev/null
2805 parallel -a /tmp/test-commands.txt > /dev/null
2807 =head4 Ping several hosts and see packet loss (read from stdin)
2809 # First copy this line and press Enter
2810 runp -p 'ping -c 5 -W 2' -s '| grep loss'
2811 localhost
2812 1.1.1.1
2813 8.8.8.8
2814 # Press Enter and Ctrl-D when done entering the hosts
2816 # First copy this line and press Enter
2817 parallel ping -c 5 -W 2 {} '| grep loss'
2818 localhost
2819 1.1.1.1
2820 8.8.8.8
2821 # Press Enter and Ctrl-D when done entering the hosts
2823 =head4 Get directories' sizes (read from stdin)
2825 echo -e "$HOME\n/etc\n/tmp" | runp -q -p 'sudo du -sh'
2827 echo -e "$HOME\n/etc\n/tmp" | parallel sudo du -sh
2828 # or:
2829 parallel sudo du -sh ::: "$HOME" /etc /tmp
2831 =head4 Compress files
2833 find . -iname '*.txt' | runp -p 'gzip --best'
2835 find . -iname '*.txt' | parallel gzip --best
2837 =head4 Measure HTTP request + response time
2839 export CURL="curl -w 'time_total: %{time_total}\n'"
2840 CURL="$CURL -o /dev/null -s https://golang.org/"
2841 perl -wE 'for (1..10) { say $ENV{CURL} }' |
2842 runp -q # Make 10 requests
2844 perl -wE 'for (1..10) { say $ENV{CURL} }' | parallel
2845 # or:
2846 parallel -N0 "$CURL" ::: {1..10}
2848 =head4 Find open TCP ports
2850 cat << EOF > /tmp/host-port.txt
2851 localhost 22
2852 localhost 80
2853 localhost 81
2854 127.0.0.1 443
2855 127.0.0.1 444
2856 scanme.nmap.org 22
2857 scanme.nmap.org 23
2858 scanme.nmap.org 443
2859 EOF
2861 1$ cat /tmp/host-port.txt |
2862 runp -q -p 'netcat -v -w2 -z' 2>&1 | egrep '(succeeded!|open)$'
2864 # --colsep is needed to split the line
2865 1$ cat /tmp/host-port.txt |
2866 parallel --colsep ' ' netcat -v -w2 -z 2>&1 |
2867 egrep '(succeeded!|open)$'
2868 # or use uq for unquoted:
2869 1$ cat /tmp/host-port.txt |
2870 parallel netcat -v -w2 -z {=uq=} 2>&1 |
2871 egrep '(succeeded!|open)$'
2873 https://github.com/jreisinger/runp
2874 (Last checked: 2020-04)
2877 =head2 DIFFERENCES BETWEEN papply AND GNU Parallel
2879 Summary (see legend above):
2881 =over
2883 =item - - - I4 - - -
2885 =item M1 - M3 - - M6
2887 =item - - O3 - O5 - - x x O10
2889 =item E1 - - E4 - - -
2891 =item - - - - - - - - -
2893 =item - -
2895 =back
2897 B<papply> does not print the output if the command fails:
2899 $ papply 'echo %F; false' foo
2900 "echo foo; false" did not succeed
2902 B<papply>'s replacement strings (%F %d %f %n %e %z) can be simulated in GNU
2903 B<parallel> by putting this in B<~/.parallel/config>:
2905 --rpl '%F'
2906 --rpl '%d $_=Q(::dirname($_));'
2907 --rpl '%f s:.*/::;'
2908 --rpl '%n s:.*/::;s:\.[^/.]+$::;'
2909 --rpl '%e s:.*\.:.:'
2910 --rpl '%z $_=""'
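To make the meaning of the replacement strings concrete, the sketch below prints the parts of a path that %F, %d, %f, %n and %e refer to, using plain shell parameter expansion. B<papply_parts> is a hypothetical helper name, and the sketch assumes the path contains both a directory and an extension.

```shell
# papply_parts is a hypothetical helper, not part of papply or GNU parallel.
# It assumes the path has a directory part and an extension.
papply_parts() {
  path=$1
  file=${path##*/}                          # strip everything up to the last /
  printf '%s\n' "%F=$path"                  # the full path
  printf '%s\n' "%d=$(dirname "$path")"     # directory part
  printf '%s\n' "%f=$file"                  # file name without directory
  printf '%s\n' "%n=${file%.*}"             # file name without last extension
  printf '%s\n' "%e=.${path##*.}"           # last extension including the dot
}

papply_parts dir/sub/photo.png
```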
2912 B<papply> buffers in RAM, and uses twice the size of the output. So
2913 output of 5 GB takes 10 GB RAM.
2915 The buffering is very CPU intensive: Buffering a line of 5 GB takes 40
2916 seconds (compared to 10 seconds with GNU B<parallel>).
2919 =head3 Examples as GNU Parallel
2921 1$ papply gzip *.txt
2923 1$ parallel gzip ::: *.txt
2925 2$ papply "convert %F %n.jpg" *.png
2927 2$ parallel convert {} {.}.jpg ::: *.png
2930 https://pypi.org/project/papply/
2931 (Last checked: 2020-04)
2934 =head2 DIFFERENCES BETWEEN async AND GNU Parallel
2936 Summary (see legend above):
2938 =over
2940 =item - - - I4 - - I7
2942 =item - - - - - M6
2944 =item - O2 O3 - O5 O6 - x x O10
2946 =item E1 - - E4 - E6 -
2948 =item - - - - - - - - -
2950 =item S1 S2
2952 =back
2954 B<async> is very similar to GNU B<parallel>'s B<--semaphore> mode
2955 (aka B<sem>). B<async> requires the user to start a server process.
2957 The input is quoted like B<-q> so you need B<bash -c "...;..."> to run
2958 composed commands.
2960 =head3 Examples as GNU Parallel
2962 1$ S="/tmp/example_socket"
2964 1$ ID=myid
2966 2$ async -s="$S" server --start
2968 2$ # GNU Parallel does not need a server to run
2970 3$ for i in {1..20}; do
2971 # prints command output to stdout
2972 async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
2973 done
2975 3$ for i in {1..20}; do
2976 # prints command output to stdout
2977 sem --id "$ID" -j100% "sleep 1 && echo test $i"
2978 # GNU Parallel will only print job when it is done
2979 # If you need output from different jobs to mix
2980 # use -u or --line-buffer
2981 sem --id "$ID" -j100% --line-buffer "sleep 1 && echo test $i"
2982 done
2984 4$ # wait until all commands are finished
2985 async -s="$S" wait
2987 4$ sem --id "$ID" --wait
2989 5$ # configure the server to run four commands in parallel
2990 async -s="$S" server -j4
2992 5$ export PARALLEL=-j4
2994 6$ mkdir "/tmp/ex_dir"
2995 for i in {21..40}; do
2996 # redirects command output to /tmp/ex_dir/file*
2997 async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
2998 bash -c "sleep 1 && echo test $i"
2999 done
3001 6$ mkdir "/tmp/ex_dir"
3002 for i in {21..40}; do
3003 # redirects command output to /tmp/ex_dir/file*
3004 sem --id "$ID" --result '/tmp/my-ex/file-{=$_=""=}'"$i" \
3005 "sleep 1 && echo test $i"
3006 done
3008 7$ sem --id "$ID" --wait
3010 7$ async -s="$S" wait
3012 8$ # stops server
3013 async -s="$S" server --stop
3015 8$ # GNU Parallel does not need to stop a server
3018 https://github.com/ctbur/async/
3019 (Last checked: 2023-01)
3022 =head2 DIFFERENCES BETWEEN pardi AND GNU Parallel
3024 Summary (see legend above):
3026 =over
3028 =item I1 I2 - - - - I7
3030 =item M1 - - - - M6
3032 =item O1 O2 O3 O4 O5 - O7 - - O10
3034 =item E1 - - E4 - - -
3036 =item - - - - - - - - -
3038 =item - -
3040 =back
3042 B<pardi> is very similar to B<parallel --pipe --cat>: It reads blocks
3043 of data and not arguments. So it cannot insert an argument in the
3044 command line. It puts the block into a temporary file, and this file
3045 name (%IN) can be put in the command line. You can only use %IN once.
3047 It can also run full command lines in parallel (like: B<cat file |
3048 parallel>).
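The block-to-temporary-file idea can be sketched in plain shell. B<per_block> is a hypothetical helper, not part of B<pardi>. It assumes B<mktemp -d> is available, appends the file name as the last argument instead of substituting %IN, and processes the blocks one at a time, whereas B<pardi> and B<parallel --pipe --cat> run them in parallel.

```shell
# per_block is a hypothetical helper, not part of pardi or GNU parallel.
# Usage: per_block LINES COMMAND [ARGS...]
# It splits stdin into blocks of LINES lines, stores each block in a
# temporary file, and runs COMMAND once per block with the file name
# appended. Blocks are processed serially in input order.
per_block() {
  lines=$1
  shift
  dir=$(mktemp -d) || return 1
  split -l "$lines" - "$dir/block."
  for f in "$dir"/block.*; do
    [ -e "$f" ] || continue   # empty input: the glob matched nothing
    "$@" "$f"
  done
  rm -rf "$dir"
}

# Reverse-sort each block of 2 lines
printf '%s\n' a b c d e | per_block 2 sort -r
```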
3050 =head3 EXAMPLES FROM pardi test.sh
3052 1$ time pardi -v -c 100 -i data/decoys.smi -ie .smi -oe .smi \
3053 -o data/decoys_std_pardi.smi \
3054 -w '(standardiser -i %IN -o %OUT 2>&1) > /dev/null'
3056 1$ cat data/decoys.smi |
3057 time parallel -N 100 --pipe --cat \
3058 '(standardiser -i {} -o {#} 2>&1) > /dev/null; cat {#}; rm {#}' \
3059 > data/decoys_std_pardi.smi
3061 2$ pardi -n 1 -i data/test_in.types -o data/test_out.types \
3062 -d 'r:^#atoms:' -w 'cat %IN > %OUT'
3064 2$ cat data/test_in.types |
3065 parallel -n 1 -k --pipe --cat --regexp --recstart '^#atoms' \
3066 'cat {}' > data/test_out.types
3068 3$ pardi -c 6 -i data/test_in.types -o data/test_out.types \
3069 -d 'r:^#atoms:' -w 'cat %IN > %OUT'
3071 3$ cat data/test_in.types |
3072 parallel -n 6 -k --pipe --cat --regexp --recstart '^#atoms' \
3073 'cat {}' > data/test_out.types
3075 4$ pardi -i data/decoys.mol2 -o data/still_decoys.mol2 \
3076 -d 's:@<TRIPOS>MOLECULE' -w 'cp %IN %OUT'
3078 4$ cat data/decoys.mol2 |
3079 parallel -n 1 --pipe --cat --recstart '@<TRIPOS>MOLECULE' \
3080 'cp {} {#}; cat {#}; rm {#}' > data/still_decoys.mol2
3082 5$ pardi -i data/decoys.mol2 -o data/decoys2.mol2 \
3083 -d b:10000 -w 'cp %IN %OUT' --preserve
3085 5$ cat data/decoys.mol2 |
3086 parallel -k --pipe --block 10k --recend '' --cat \
3087 'cat {} > {#}; cat {#}; rm {#}' > data/decoys2.mol2
3089 https://github.com/UnixJunkie/pardi
3090 (Last checked: 2021-01)
3093 =head2 DIFFERENCES BETWEEN bthread AND GNU Parallel
3095 Summary (see legend above):
3097 =over
3099 =item - - - I4 - - -
3101 =item - - - - - M6
3103 =item O1 - O3 - - - O7 O8 - -
3105 =item E1 - - - - - -
3107 =item - - - - - - - - -
3109 =item - -
3111 =back
3113 B<bthread> takes around 1 sec per MB of output. The maximal output
3114 line length is 1073741759.
3116 You cannot quote space in the command, so you cannot run composed
3117 commands like B<sh -c "echo a; echo b">.
3119 https://gitlab.com/netikras/bthread
3120 (Last checked: 2021-01)
3123 =head2 DIFFERENCES BETWEEN simple_gpu_scheduler AND GNU Parallel
3125 Summary (see legend above):
3127 =over
3129 =item I1 - - - - - I7
3131 =item M1 - - - - M6
3133 =item - O2 O3 - - O6 - x x O10
3135 =item E1 - - - - - -
3137 =item - - - - - - - - -
3139 =item - -
3141 =back
3143 =head3 EXAMPLES FROM simple_gpu_scheduler MANUAL
3145 1$ simple_gpu_scheduler --gpus 0 1 2 < gpu_commands.txt
3147 1$ parallel -j3 --shuf \
3148 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' \
3149 < gpu_commands.txt
3151 2$ simple_hypersearch \
3152 "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
3153 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
3154 simple_gpu_scheduler --gpus 0,1,2
3156 2$ parallel --header : --shuf -j3 -v \
3157 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =}' \
3158 python3 train_dnn.py --lr {lr} --batch_size {bs} \
3159 ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
3161 3$ simple_hypersearch \
3162 "python3 train_dnn.py --lr {lr} --batch_size {bs}" \
3163 --n-samples 5 -p lr 0.001 0.0005 0.0001 -p bs 32 64 128 |
3164 simple_gpu_scheduler --gpus 0,1,2
3166 3$ parallel --header : --shuf \
3167 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1; seq()>5 and skip() =}' \
3168 python3 train_dnn.py --lr {lr} --batch_size {bs} \
3169 ::: lr 0.001 0.0005 0.0001 ::: bs 32 64 128
3171 4$ touch gpu.queue
3172 tail -f -n 0 gpu.queue | simple_gpu_scheduler --gpus 0,1,2 &
3173 echo "my_command_with | and stuff > logfile" >> gpu.queue
3175 4$ touch gpu.queue
3176 tail -f -n 0 gpu.queue |
3177 parallel -j3 CUDA_VISIBLE_DEVICES='{=1 $_=slot()-1 =} {=uq;=}' &
3178 # Needed to fill job slots once
3179 seq 3 | parallel echo true >> gpu.queue
3180 # Add jobs
3181 echo "my_command_with | and stuff > logfile" >> gpu.queue
3182 # Needed to flush output from completed jobs
3183 seq 3 | parallel echo true >> gpu.queue
3185 https://github.com/ExpectationMax/simple_gpu_scheduler
3186 (Last checked: 2021-01)
3189 =head2 DIFFERENCES BETWEEN parasweep AND GNU Parallel
3191 B<parasweep> is a Python module for facilitating parallel parameter
3192 sweeps.
3194 A B<parasweep> job will normally take a text file as input. The text
3195 file contains arguments for the job. Some of these arguments will be
3196 fixed and some of them will be changed by B<parasweep>.
3198 It does this by having a template file such as template.txt:
3200 Xval: {x}
3201 Yval: {y}
3202 FixedValue: 9
3203 # x with 2 decimals
3204 DecimalX: {x:.2f}
3205 TenX: ${x*10}
3206 RandomVal: {r}
3208 and from this template it generates the file to be used by the job by
3209 replacing the replacement strings.
3211 Being a Python module B<parasweep> integrates more tightly with Python than
3212 GNU B<parallel>. You get the parameters directly in a Python data
3213 structure. With GNU B<parallel> you can use the JSON or CSV output
3214 format to get something similar, but you would have to read the
3215 output.
3217 B<parasweep> has a filtering method to ignore parameter combinations
3218 you do not need.
3220 Instead of calling the jobs directly, B<parasweep> can use Python's
3221 Distributed Resource Management Application API to make jobs run with
3222 different cluster software.
3225 GNU B<parallel> B<--tmpl> supports templates with replacement
3226 strings, such as:
3228 Xval: {x}
3229 Yval: {y}
3230 FixedValue: 9
3231 # x with 2 decimals
3232 DecimalX: {=x $_=sprintf("%.2f",$_) =}
3233 TenX: {=x $_=$_*10 =}
3234 RandomVal: {=1 $_=rand() =}
3236 that can be used like:
3238 parallel --header : --tmpl my.tmpl={#}.t myprog {#}.t \
3239 ::: x 1 2 3 ::: y 1 2 3
3241 Filtering is supported as:
3243 parallel --filter '{1} > {2}' echo ::: 1 2 3 ::: 1 2 3
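To see which combinations the filter above lets through, the same selection can be written as nested loops. This is only a serial sketch of the selection logic, and B<filter_combinations> is a hypothetical helper name.

```shell
# filter_combinations is a hypothetical helper for illustration.
# It walks all combinations of 1 2 3 with 1 2 3 and prints only those
# where the first value is greater than the second.
filter_combinations() {
  for a in 1 2 3; do
    for b in 1 2 3; do
      if [ "$a" -gt "$b" ]; then
        echo "$a $b"
      fi
    done
  done
}

filter_combinations
```

Of the 9 combinations only 3 remain: 2 1, 3 1 and 3 2.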
3245 https://github.com/eviatarbach/parasweep
3246 (Last checked: 2021-01)
3249 =head2 DIFFERENCES BETWEEN parallel-bash AND GNU Parallel
3251 Summary (see legend above):
3253 =over
3255 =item I1 I2 - - - - -
3257 =item - - M3 - - M6
3259 =item - O2 O3 - O5 O6 - O8 x O10
3261 =item E1 - - - - - -
3263 =item - - - - - - - - -
3265 =item - -
3267 =back
3269 B<parallel-bash> is written in pure bash. It is really fast (overhead
3270 of ~0.05 ms/job compared to GNU B<parallel>'s 3-10 ms/job). So if your
3271 jobs are extremely short lived, and you can live with the quite
3272 limited command, this may be useful.
3274 It works by making a queue for each process. Then the jobs are
3275 distributed to the queues in a round robin fashion. Finally the queues
3276 are started in parallel. This works fine, if you are lucky, but if
3277 not, all the long jobs may end up in the same queue, so you may see:
3279 $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
3280 time parallel -P4 sleep {}
3281 (7 seconds)
3282 $ printf "%b\n" 1 1 1 4 1 1 1 4 1 1 1 4 |
3283 time ./parallel-bash.bash -p 4 -c sleep {}
3284 (12 seconds)
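The difference between 7 and 12 seconds can be reproduced without running any jobs. The sketch below only simulates the two scheduling strategies. It assumes zero overhead per job and that round robin puts job i into queue (i-1) mod 4; the helper names B<round_robin> and B<next_free> are made up for this illustration.

```shell
# Simulated wall-clock time for jobs of the given lengths (in seconds)
# on n workers. No jobs are actually run.
job_lengths="1 1 1 4 1 1 1 4 1 1 1 4"

# Round robin: job i is assigned in advance to queue (i-1) mod n.
round_robin() {
  echo "$job_lengths" | awk -v n="$1" '{
    for (q = 0; q < n; q++) busy[q] = 0
    for (i = 1; i <= NF; i++) busy[(i - 1) % n] += $i
    max = 0
    for (q = 0; q < n; q++) if (busy[q] > max) max = busy[q]
    print max
  }'
}

# Next free worker: each job starts on the worker that is free first.
next_free() {
  echo "$job_lengths" | awk -v n="$1" '{
    for (q = 0; q < n; q++) busy[q] = 0
    for (i = 1; i <= NF; i++) {
      best = 0
      for (q = 1; q < n; q++) if (busy[q] < busy[best]) best = q
      busy[best] += $i
    }
    max = 0
    for (q = 0; q < n; q++) if (busy[q] > max) max = busy[q]
    print max
  }'
}

round_robin 4   # prints 12
next_free 4     # prints 7
```

With round robin all three 4-second jobs land in the same queue, which is why that queue alone takes 12 seconds.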
3286 Because it uses bash lists, the total number of jobs is limited to
3287 167000..265000 depending on your environment. You get a segmentation
3288 fault when you reach the limit.
3290 Ctrl-C does not stop spawning new jobs. Ctrl-Z does not suspend
3291 running jobs.
3294 =head3 EXAMPLES FROM parallel-bash
3296 1$ some_input | parallel-bash -p 5 -c echo
3298 1$ some_input | parallel -j 5 echo
3300 2$ parallel-bash -p 5 -c echo < some_file
3302 2$ parallel -j 5 echo < some_file
3304 3$ parallel-bash -p 5 -c echo <<< 'some string'
3306 3$ parallel -j 5 echo <<< 'some string'
3308 4$ something | parallel-bash -p 5 -c echo {} {}
3310 4$ something | parallel -j 5 echo {} {}
3312 https://reposhub.com/python/command-line-tools/Akianonymus-parallel-bash.html
3313 (Last checked: 2021-06)
3316 =head2 DIFFERENCES BETWEEN bash-concurrent AND GNU Parallel
3318 B<bash-concurrent> is more an alternative to B<make> than to GNU
3319 B<parallel>. Its input is very similar to a Makefile, where jobs
3320 depend on other jobs.
3322 It has a nice progress indicator where you can see which jobs
3323 completed successfully, which jobs are currently running, which jobs
3324 failed, and which jobs were skipped because a job they depend on failed.
3325 The indicator does not deal well with resizing the window.
3327 Output is cached in tempfiles on disk, but is only shown if there is
3328 an error, so it is not meant to be part of a UNIX pipeline. If
3329 B<bash-concurrent> crashes these tempfiles are not removed.
3331 It uses an O(n*n) algorithm, so if you have 1000 independent jobs it
3332 takes 22 seconds to start it.
3334 https://github.com/themattrix/bash-concurrent
3335 (Last checked: 2021-02)


=head2 DIFFERENCES BETWEEN spawntool AND GNU Parallel

Summary (see legend above):

=over

=item I1 - - - - - -

=item M1 - - - - M6

=item - O2 O3 - O5 O6 - x x O10

=item E1 - - - - - -

=item - - - - - - - - -

=item - -

=back

B<spawn> reads a full command line from stdin which it executes in
parallel.


http://code.google.com/p/spawntool/
(Last checked: 2021-07)


=head2 DIFFERENCES BETWEEN go-pssh AND GNU Parallel

Summary (see legend above):

=over

=item - - - - - - -

=item M1 - - - - -

=item O1 - - - - - - x x O10

=item E1 - - - - - -

=item R1 R2 - - - R6 - - -

=item - -

=back

B<go-pssh> does B<ssh> in parallel to multiple machines. It runs the
same command on multiple machines similar to B<--nonall>.

The hosts must be given as IP addresses (hostnames are not accepted).

Output is sent to stdout (standard output) if the command is
successful, and to stderr (standard error) if the command fails.
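
The routing described above can be imitated in plain shell. This is
only an illustration of the observable behaviour, not B<go-pssh>'s
implementation: the function name is made up, and merging the
command's own stderr into the captured output is an assumption.

```shell
# Hypothetical helper: run a command, then print everything it wrote
# on stdout if it succeeded, or on stderr if it failed.
route_by_status() {
  rbs_out=$("$@" 2>&1)
  rbs_status=$?
  if [ "$rbs_status" -eq 0 ]; then
    printf '%s\n' "$rbs_out"
  else
    printf '%s\n' "$rbs_out" >&2
  fi
  # Pass the exit status of the command on to the caller
  return "$rbs_status"
}

route_by_status echo ok                       # "ok" on stdout
route_by_status sh -c 'echo bad; exit 1'      # "bad" on stderr
```

A consequence of this design is that the output of a failing command
is lost if you only capture stdout.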

=head3 EXAMPLES FROM go-pssh

  1$ go-pssh -l <ip>,<ip> -u <user> -p <port> -P <passwd> -c "<command>"

  1$ parallel -S 'sshpass -p <passwd> ssh -p <port> <user>@<ip>' \
       --nonall "<command>"

  2$ go-pssh scp -f host.txt -u <user> -p <port> -P <password> \
       -s /local/file_or_directory -d /remote/directory

  2$ parallel --nonall --slf host.txt \
       --basefile /local/file_or_directory/./ --wd /remote/directory \
       --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true

  3$ go-pssh scp -l <ip>,<ip> -u <user> -p <port> -P <password> \
       -s /local/file_or_directory -d /remote/directory

  3$ parallel --nonall -S <ip>,<ip> \
       --basefile /local/file_or_directory/./ --wd /remote/directory \
       --ssh 'sshpass -p <password> ssh -p <port> -l <user>' true

https://github.com/xuchenCN/go-pssh
(Last checked: 2021-07)


=head2 DIFFERENCES BETWEEN go-parallel AND GNU Parallel

Summary (see legend above):

=over

=item I1 I2 - - - - I7

=item - - M3 - - M6

=item - O2 O3 - O5 - - x x - O10

=item E1 - - E4 - - -

=item - - - - - - - - -

=item - -

=back

B<go-parallel> uses Go templates for replacement strings. Quite
similar to the I<{= perl expr =}> replacement string.

=head3 EXAMPLES FROM go-parallel

  1$ go-parallel -a ./files.txt -t 'cp {{.Input}} {{.Input | dirname | dirname}}'

  1$ parallel -a ./files.txt cp {} '{= $_=::dirname(::dirname($_)) =}'

  2$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{noExt .Input}}'

  2$ parallel -a ./files.txt echo mkdir -p {} {.}

  3$ go-parallel -a ./files.txt -t 'mkdir -p {{.Input}} {{.Input | basename | noExt}}'

  3$ parallel -a ./files.txt echo mkdir -p {} {/.}

https://github.com/mylanconnolly/parallel
(Last checked: 2021-07)


=head2 DIFFERENCES BETWEEN p AND GNU Parallel

Summary (see legend above):

=over

=item - - - I4 - - x

=item - - - - - M6

=item - O2 O3 - O5 O6 - x x - O10

=item E1 - - - - - -

=item - - - - - - - - -

=item - -

=back

B<p> is a tiny shell script. It can color output with some predefined
colors, but is otherwise quite limited.

It maxes out at around 116000 jobs (probably due to limitations in Bash).

=head3 EXAMPLES FROM p

Some of the examples from B<p> cannot be implemented 100% by GNU
B<parallel>: The coloring done by GNU B<parallel> is not exactly the
same as that of B<p>, and GNU B<parallel> cannot have B<--tag> for
some inputs and not for others.

  1$ p -bc blue "ping 127.0.0.1" -uc red "ping 192.168.0.1" \
       -rc yellow "ping 192.168.1.1" -t example "ping example.com"

  1$ parallel --lb -j0 --color --tag ping \
       ::: 127.0.0.1 192.168.0.1 192.168.1.1 example.com

  2$ p "tail -f /var/log/httpd/access_log" \
       -bc red "tail -f /var/log/httpd/error_log"

  2$ cd /var/log/httpd;
     parallel --lb --color --tag tail -f ::: access_log error_log

  3$ p tail -f "some file" \& p tail -f "other file with space.txt"

  3$ parallel --lb tail -f ::: 'some file' "other file with space.txt"

  4$ p -t project1 "hg pull project1" -t project2 \
       "hg pull project2" -t project3 "hg pull project3"

  4$ parallel --lb hg pull ::: project{1..3}

https://github.com/rudymatela/evenmoreutils/blob/master/man/p.1.adoc
(Last checked: 2022-04)


=head2 DIFFERENCES BETWEEN seneschal AND GNU Parallel

Summary (see legend above):

=over

=item I1 - - - - - -

=item M1 - M3 - - M6

=item O1 - O3 O4 - - - x x -

=item E1 - - - - - -

=item - - - - - - - - -

=item - -

=back

B<seneschal> only starts the first job after reading the last job, and
output from the first job is only printed after the last job finishes.

1 byte of output requires 3.5 bytes of RAM.

This makes it impossible to have a total output bigger than the
virtual memory.

Even though output is kept in RAM, outputting is quite slow: 30 MB/s.

Output larger than 4 GB causes random problems - it looks like a race
condition.

This:

  echo 1 | seneschal --prefix='yes `seq 1000`|head -c 1G' >/dev/null

takes 4100(!) CPU seconds to run on a 64C64T server, but only 140 CPU
seconds on a 4C8T laptop. So it looks like B<seneschal> wastes a lot
of CPU time coordinating the CPUs.

Compare this to:

  echo 1 | time -v parallel -N0 'yes `seq 1000`|head -c 1G' >/dev/null

which takes 3-8 CPU seconds.

=head3 EXAMPLES FROM seneschal README.md

  1$ echo $REPOS | seneschal --prefix="cd {} && git pull"

  # If $REPOS is newline separated
  1$ echo "$REPOS" | parallel -k "cd {} && git pull"
  # If $REPOS is space separated
  1$ echo -n "$REPOS" | parallel -d' ' -k "cd {} && git pull"

  COMMANDS="pwd
  sleep 5 && echo boom
  echo Howdy
  whoami"

  2$ echo "$COMMANDS" | seneschal --debug

  2$ echo "$COMMANDS" | parallel -k -v

  3$ ls -1 | seneschal --prefix="pushd {}; git pull; popd;"

  3$ ls -1 | parallel -k "pushd {}; git pull; popd;"
  # Or if current dir also contains files:
  3$ parallel -k "pushd {}; git pull; popd;" ::: */

https://github.com/TheWizardTower/seneschal
(Last checked: 2022-06)


=head2 DIFFERENCES BETWEEN async AND GNU Parallel

Summary (see legend above):

=over

=item x x x x x x x

=item - x x x x x

=item x O2 O3 O4 O5 O6 - x x O10

=item E1 - - E4 - - -

=item - - - - - - - - -

=item S1 S2

=back

B<async> works like B<sem>.


=head3 EXAMPLES FROM async

  1$ S="/tmp/example_socket"

     async -s="$S" server --start

     for i in {1..20}; do
       # prints command output to stdout
       async -s="$S" cmd -- bash -c "sleep 1 && echo test $i"
     done

     # wait until all commands are finished
     async -s="$S" wait

  1$ S="example_id"

     # server not needed

     for i in {1..20}; do
       # prints command output to stdout
       sem --bg --id "$S" -j100% "sleep 1 && echo test $i"
     done

     # wait until all commands are finished
     sem --fg --id "$S" --wait

  2$ # configure the server to run four commands in parallel
     async -s="$S" server -j4

     mkdir "/tmp/ex_dir"
     for i in {21..40}; do
       # redirects command output to /tmp/ex_dir/file*
       async -s="$S" cmd -o "/tmp/ex_dir/file$i" -- \
         bash -c "sleep 1 && echo test $i"
     done

     async -s="$S" wait

     # stops server
     async -s="$S" server --stop

  2$ # starting server not needed

     mkdir "/tmp/ex_dir"
     for i in {21..40}; do
       # redirects command output to /tmp/ex_dir/file*
       sem --bg --id "$S" --results "/tmp/ex_dir/file$i{}" \
         "sleep 1 && echo test $i"
     done

     sem --fg --id "$S" --wait

     # there is no server to stop

https://github.com/ctbur/async
(Last checked: 2023-01)


=head2 DIFFERENCES BETWEEN tandem AND GNU Parallel

Summary (see legend above):

=over

=item - - - I4 - - x

=item M1 - - - - M6

=item - - O3 - - - - x - -

=item E1 - E3 - E5 - -

=item - - - - - - - - -

=item - -

=back

B<tandem> runs full commands in parallel. It is made for starting a
"server", running a job against the server, and when the job is done,
the server is killed.

More generally: it kills all jobs when the first job completes -
similar to '--halt now,done=1'.
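
The "first job done kills the rest" behaviour can be sketched in plain
bash. This illustrates the behaviour described above; it is not how
B<tandem> or GNU B<parallel> implement it. The function name is made
up for the example, and B<wait -n> requires bash 4.3 or later.

```shell
#!/bin/bash

# Hypothetical helper: start every argument as a command in the
# background, wait for the first one to finish, then kill the others.
first_done_kills_rest() {
  local cmd
  local -a pids=()
  for cmd in "$@"; do
    bash -c "$cmd" &
    pids+=("$!")
  done
  # Return as soon as any one of the background jobs completes
  wait -n
  # Signal the remaining jobs; jobs that are already done are ignored
  kill "${pids[@]}" 2>/dev/null
  # Reap the killed jobs quietly
  wait 2>/dev/null
}
```

Usage could look like this, where the long-running "server" is killed
when the short job finishes:

  first_done_kills_rest 'exec sleep 1000' 'sleep 1; echo job done'

The sketch only signals the direct children. If a command starts
further processes, those may survive, which is one reason to prefer a
real tool over a sketch like this.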

B<tandem> silently discards some output. It is unclear exactly when
this happens. It looks like a race condition, because it varies for
each run.

  $ tandem "seq 10000" | wc -l
  6731 <- This should always be 10002


=head3 EXAMPLES FROM Demo

  tandem \
    'php -S localhost:8000' \
    'esbuild src/*.ts --bundle --outdir=dist --watch' \
    'tailwind -i src/index.css -o dist/index.css --watch'

  # Emulate tandem's behaviour
  PARALLEL='--color --lb --halt now,done=1 --tagstring '
  PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
  export PARALLEL

  parallel ::: \
    'php -S localhost:8000' \
    'esbuild src/*.ts --bundle --outdir=dist --watch' \
    'tailwind -i src/index.css -o dist/index.css --watch'


=head3 EXAMPLES FROM tandem -h

  # Emulate tandem's behaviour
  PARALLEL='--color --lb --halt now,done=1 --tagstring '
  PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
  export PARALLEL

  1$ tandem 'sleep 5 && echo "hello"' 'sleep 2 && echo "world"'

  1$ parallel ::: 'sleep 5 && echo "hello"' 'sleep 2 && echo "world"'

  # '-t 0' fails, but '--timeout 0' works
  2$ tandem --timeout 0 'sleep 5 && echo "hello"' \
       'sleep 2 && echo "world"'

  2$ parallel --timeout 0 ::: 'sleep 5 && echo "hello"' \
       'sleep 2 && echo "world"'

=head3 EXAMPLES FROM tandem's readme.md

  # Emulate tandem's behaviour
  PARALLEL='--color --lb --halt now,done=1 --tagstring '
  PARALLEL="$PARALLEL'"'{=s/ .*//; $_.=".".$app{$_}++;=}'"'"
  export PARALLEL

  1$ tandem 'next dev' 'nodemon --quiet ./server.js'

  1$ parallel ::: 'next dev' 'nodemon --quiet ./server.js'

  2$ cat package.json
     {
       "scripts": {
         "dev:php": "...",
         "dev:js": "...",
         "dev:css": "..."
       }
     }

     tandem 'npm:dev:php' 'npm:dev:js' 'npm:dev:css'

  # GNU Parallel uses bash functions instead
  2$ cat package.sh
     dev:php() { ... ; }
     dev:js() { ... ; }
     dev:css() { ... ; }
     export -f dev:php dev:js dev:css

     . package.sh
     parallel ::: dev:php dev:js dev:css

  3$ tandem 'npm:dev:*'

  3$ compgen -A function | grep ^dev: | parallel

For usage in Makefiles, include a copy of GNU B<parallel> with your
source using B<parallel --embed>. This has the added benefit of also
working if access to the internet is down or restricted.

https://github.com/rosszurowski/tandem
(Last checked: 2023-01)


=head2 DIFFERENCES BETWEEN rust-parallel(aaronriekenberg) AND GNU Parallel

Summary (see legend above):

=over

=item I1 I2 I3 - - - -

=item - - - - - M6

=item O1 O2 O3 - O5 O6 - x - O10

=item E1 - - E4 - - -

=item - - - - - - - - -

=item - -

=back

B<rust-parallel> has a goal of only using Rust. It seems it is
impossible to call bash functions from the command line. You would
need to put these in a script.

Calling a script that lacks the shebang line (#! as first line)
fails.

=head3 EXAMPLES FROM rust-parallel's README.md

  $ cat >./test <<EOL
  echo hi
  echo there
  echo how
  echo are
  echo you
  EOL

  1$ cat test | rust-parallel -j5

  1$ cat test | parallel -j5

  2$ cat test | rust-parallel -j1

  2$ cat test | parallel -j1

  3$ head -100 /usr/share/dict/words | rust-parallel md5 -s

  3$ head -100 /usr/share/dict/words | parallel md5 -s

  4$ find . -type f -print0 | rust-parallel -0 gzip -f -k

  4$ find . -type f -print0 | parallel -0 gzip -f -k

  5$ head -100 /usr/share/dict/words |
       awk '{printf "md5 -s %s\n", $1}' | rust-parallel

  5$ head -100 /usr/share/dict/words |
       awk '{printf "md5 -s %s\n", $1}' | parallel

  6$ head -100 /usr/share/dict/words | rust-parallel md5 -s |
       grep -i abba

  6$ head -100 /usr/share/dict/words | parallel md5 -s |
       grep -i abba

https://github.com/aaronriekenberg/rust-parallel
(Last checked: 2023-01)


=head2 DIFFERENCES BETWEEN parallelium AND GNU Parallel

Summary (see legend above):

=over

=item - I2 - - - - -

=item M1 - - - - M6

=item O1 - O3 - - - - x - -

=item E1 - - E4 - - -

=item - - - - - - - - -

=item - -

=back

B<parallelium> merges standard output (stdout) and standard error
(stderr). The maximal output of a command is 8192 bytes. Bigger output
makes B<parallelium> go into an infinite loop.

In the input file for B<parallelium> you can define a tag, so that you
can select to run only these commands. A bit like a target in a
Makefile.

Progress is printed on standard output (stdout) prepended with '#'
with similar information as GNU B<parallel>'s B<--bar>.

=head3 EXAMPLES

  $ cat testjobs.txt
  #tag common sleeps classA
  (sleep 4.495;echo "job 000")
  (sleep 2.587;echo "job 016")

  #tag common sleeps classB
  (sleep 0.218;echo "job 017")
  (sleep 2.269;echo "job 040")

  #tag common sleeps classC
  (sleep 2.586;echo "job 041")
  (sleep 1.626;echo "job 099")

  #tag lasthalf, sleeps, classB
  (sleep 1.540;echo "job 100")
  (sleep 2.001;echo "job 199")

  1$ parallelium -f testjobs.txt -l logdir -t classB,classC

  1$ cat testjobs.txt |
       parallel --plus --results logdir/testjobs.txt_{0#}.output \
         '{= if(/^#tag /) { @tag = split/,|\s+/ }
             (grep /^(classB|classC)$/, @tag) or skip =}'

https://github.com/beomagi/parallelium
(Last checked: 2023-01)


=head2 DIFFERENCES BETWEEN forkrun AND GNU Parallel

Summary (see legend above):

=over

=item I1 - - - - - I7

=item - - - - - -

=item - O2 O3 - O5 - - - - O10

=item E1 - - E4 - - -

=item - - - - - - - - -

=item - -

=back


B<forkrun> blocks if it receives fewer jobs than slots:

  echo | forkrun -p 2 echo

or when it gets some specific commands e.g.:

  f() { seq "$@" | pv -qL 3; }
  seq 10 | forkrun f

It is not clear why.

It is faster than GNU B<parallel> (overhead: 1.2 ms/job vs 3 ms/job),
but way slower than B<parallel-bash> (0.059 ms/job).

Running jobs cannot be stopped by pressing CTRL-C.

B<-k> is supposed to keep the order but fails on the MIX testing
example below. If used with B<-k> it caches output in RAM.

If B<forkrun> is killed, it leaves temporary files in
B</tmp/.forkrun.*> that have to be cleaned up manually.

=head3 EXAMPLES

  1$ time find ./ -type f |
       forkrun -l512 -- sha256sum 2>/dev/null | wc -l
  1$ time find ./ -type f |
       parallel -j28 -m -- sha256sum 2>/dev/null | wc -l

  2$ time find ./ -type f |
       forkrun -l512 -k -- sha256sum 2>/dev/null | wc -l
  2$ time find ./ -type f |
       parallel -j28 -k -m -- sha256sum 2>/dev/null | wc -l

https://github.com/jkool702/forkrun
(Last checked: 2023-02)


=head2 DIFFERENCES BETWEEN parallel-sh AND GNU Parallel

Summary (see legend above):

=over

=item I1 I2 - I4 - - -

=item M1 - - - - M6

=item O1 O2 O3 - O5 O6 - - - O10

=item E1 - - E4 - - -

=item - - - - - - - - -

=item - -

=back

B<parallel-sh> buffers in RAM. Buffering the data takes O(n^1.5) time:

  2MB=0.107s 4MB=0.175s 8MB=0.342s 16MB=0.766s 32MB=2.2s 64MB=6.7s
  128MB=20s 256MB=64s 512MB=248s 1024MB=998s 2048MB=3756s

This limits the practical usability to jobs outputting < 256 MB. GNU
B<parallel> buffers on disk, yet is faster for jobs with outputs > 16
MB and is only limited by the free space in $TMPDIR.
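
The growth rate can be checked from the measurements above: if the
time is proportional to n^k, then k = log(t2/t1) / log(n2/n1) for any
two measurements. The helper below is a small sketch of that
arithmetic; its name is made up for the example.

```shell
# Hypothetical helper: exponent k in t ~ n^k, estimated from two
# (size, time) measurements.
# Usage: growth_exponent SIZE1 TIME1 SIZE2 TIME2
growth_exponent() {
  LC_ALL=C awk -v n1="$1" -v t1="$2" -v n2="$3" -v t2="$4" \
    'BEGIN { printf "%.2f\n", log(t2 / t1) / log(n2 / n1) }'
}

# Whole range: 2 MB took 0.107 s, 2048 MB took 3756 s
growth_exponent 2 0.107 2048 3756      # prints 1.51
# Last doubling only: 1024 MB took 998 s, 2048 MB took 3756 s
growth_exponent 1024 998 2048 3756     # prints 1.91
```

Over the whole range the exponent is about 1.5, while the last
doubling alone is closer to quadratic, so for large outputs the
slowdown is worse than the overall figure suggests.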

B<parallel-sh> can kill running jobs if a job fails (Similar to
B<--halt now,fail=1>).

=head3 EXAMPLES

  1$ parallel-sh "sleep 2 && echo first" "sleep 1 && echo second"

  1$ parallel ::: "sleep 2 && echo first" "sleep 1 && echo second"

  2$ cat /tmp/commands
     sleep 2 && echo first
     sleep 1 && echo second

  2$ parallel-sh -f /tmp/commands

  2$ parallel -a /tmp/commands

  3$ echo -e 'sleep 2 && echo first\nsleep 1 && echo second' |
       parallel-sh

  3$ echo -e 'sleep 2 && echo first\nsleep 1 && echo second' |
       parallel

https://github.com/thyrc/parallel-sh
(Last checked: 2023-04)


=head2 DIFFERENCES BETWEEN bash-parallel AND GNU Parallel

Summary (see legend above):

=over

=item - I2 - - - - I7

=item M1 - M3 - M5 M6

=item - O2 O3 - - O6 - O8 - O10

=item E1 - - - - - -

=item - - - - - - - - -

=item - -

=back

B<bash-parallel> is not as much a command as it is a shell script that
you have to alter. It requires you to change the shell function
process_job that runs the job, and set $MAX_POOL_SIZE to the number of
jobs to run in parallel.

It is half as fast as GNU B<parallel> for short jobs.

https://github.com/thilinaba/bash-parallel
(Last checked: 2023-05)


=head2 DIFFERENCES BETWEEN PaSH AND GNU Parallel

Summary (see legend above): N/A

B<pash> is quite different from GNU B<parallel>. It is not a general
parallelizer. It takes a shell script and analyses it and parallelizes
parts of it by replacing the parts with commands that will give the same
result.

This will replace B<sort> with a command that does pretty much the
same as B<parsort --parallel=8> (except somewhat slower):

  pa.sh --width 8 -c 'cat bigfile | sort'

However, even a simple change will confuse B<pash> and you will get no
parallelization:

  pa.sh --width 8 -c 'mysort() { sort; }; cat bigfile | mysort'
  pa.sh --width 8 -c 'cat bigfile | sort | md5sum'

From the source it seems B<pash> only looks at: awk cat col comm cut
diff grep head mkfifo mv rm sed seq sort tail tee tr uniq wc xargs

For pipelines where these commands are bottlenecks, it might be worth
testing if B<pash> is faster than GNU B<parallel>.

B<pash> does not respect $TMPDIR but always uses /tmp. If B<pash> dies
unexpectedly it does not clean up.

https://github.com/binpash/pash
(Last checked: 2023-05)


=head2 DIFFERENCES BETWEEN korovkin-parallel AND GNU Parallel

Summary (see legend above):

=over

=item I1 - - - - - -

=item M1 - - - - M6

=item - - O3 - - - - x x -

=item E1 - - - - - -

=item R1 - - - - R6 x x -

=item - -

=back

B<korovkin-parallel> prepends all lines with some info.

The output is colored with 6 color combinations, so job 1 and 7 will
get the same color.

You can get similar output with:

  (echo ...) |
    parallel --color -j 10 --lb --tagstring \
      '[l:{#}:{=$_=sprintf("%7.03f",::now()-$^T)=} {=$_=hh_mm_ss($^T)=} {%}]'

Lines longer than 8192 chars are broken into lines shorter than
8192. B<korovkin-parallel> loses the last char for lines exactly 8193
chars long.

Short lines from different jobs do not mix, but long lines do:

  fun() {
    perl -e '$a="'$1'"x1000000; for(1..'$2') { print $a };';
    echo;
  }
  export -f fun
  (echo fun a 100;echo fun b 100) | korovkin-parallel | tr -s abcdef
  # Compare to:
  (echo fun a 100;echo fun b 100) | parallel | tr -s abcdef

There should be only one line of a's and one line of b's.

Just like GNU B<parallel>, B<korovkin-parallel> offers a master/slave
model, so workers on other servers can do some of the tasks. But
contrary to GNU B<parallel> you must manually start workers on these
servers. The communication is neither authenticated nor encrypted.

It caches output in RAM: a 1GB line uses ~2.5GB RAM.

https://github.com/korovkin/parallel
(Last checked: 2023-07)


=head2 DIFFERENCES BETWEEN xe AND GNU Parallel

Summary (see legend above):

=over

=item I1 I2 - I4 - - I7

=item M1 - M3 M4 - M6

=item - O2 O3 - O5 O6 - O8 - O10

=item E1 - - E4 - - -

=item - - - - - - - - -

=item - -

=back

B<xe> has a peculiar limitation:

  echo /bin/echo | xe {} OK
  echo echo | xe /bin/{} fails


=head3 EXAMPLES

Compress all .c files in the current directory, using all CPU cores:

  1$ xe -a -j0 gzip -- *.c

  1$ parallel gzip ::: *.c

Remove all empty files, using lr(1):

  2$ lr -U -t 'size == 0' | xe -N0 rm

  2$ lr -U -t 'size == 0' | parallel -X rm

Convert .mp3 to .ogg, using all CPU cores:

  3$ xe -a -j0 -s 'ffmpeg -i "${1}" "${1%.mp3}.ogg"' -- *.mp3

  3$ parallel ffmpeg -i {} {.}.ogg ::: *.mp3

Same, using percent rules:

  4$ xe -a -j0 -p %.mp3 ffmpeg -i %.mp3 %.ogg -- *.mp3

  4$ parallel --rpl '% s/\.mp3// or skip' ffmpeg -i %.mp3 %.ogg ::: *.mp3

Similar, but hiding output of ffmpeg, instead showing spawned jobs:

  5$ xe -ap -j0 -vvq '%.{m4a,ogg,opus}' ffmpeg -y -i {} out/%.mp3 -- *

  5$ parallel -v --rpl '% s/\.(m4a|ogg|opus)// or skip' \
       ffmpeg -y -i {} out/%.mp3 '2>/dev/null' ::: *

  5$ parallel -v ffmpeg -y -i {} out/{.}.mp3 '2>/dev/null' ::: *

https://github.com/leahneukirchen/xe
(Last checked: 2023-08)


=head2 DIFFERENCES BETWEEN sp AND GNU Parallel

Summary (see legend above):

=over

=item - - - I4 - - -

=item M1 - M3 - - M6

=item - O2 O3 - O5 (O6) - x x O10

=item E1 - - - - - -

=item - - - - - - - - -

=item - -

=back

B<sp> has very few options.

It can either be used like:

  sp command {} option :: arg1 arg2 arg3

which is similar to:

  parallel command {} option ::: arg1 arg2 arg3

Or:

  sp command1 :: "command2 -option" :: "command3 foo bar"

which is similar to:

  parallel ::: command1 "command2 -option" "command3 foo bar"

B<sp> deals badly with too many commands: This causes B<sp> to run out
of file handles and gives data loss.

For each command that fails, B<sp> will print an error message on
stderr (standard error).

You cannot use exported shell functions as commands.

=head3 EXAMPLES

  1$ sp echo {} :: 1 2 3

  1$ parallel echo {} ::: 1 2 3

  2$ sp echo {} {} :: 1 2 3

  2$ parallel echo {} {} ::: 1 2 3

  3$ sp echo 1 :: echo 2 :: echo 3

  3$ parallel ::: 'echo 1' 'echo 2' 'echo 3'

  4$ sp a foo bar :: "b 'baz bar'" :: c

  4$ parallel ::: 'a foo bar' "b 'baz bar'" c

https://github.com/SergioBenitez/sp
(Last checked: 2023-10)


=head2 DIFFERENCES BETWEEN repeater AND GNU Parallel

Summary (see legend above):

=over

=item - - - - - - -

=item - - - - - -

=item - O2 O3 N/A - O6 - x x ?O10

=item E1 - - - E5 - -

=item - - - - - - - - -

=item - -

=back

B<repeater> runs the same job repeatedly. In other words: It does not
read arguments, thus it is an alternative to GNU B<parallel> for only
quite limited applications.

B<repeater> has an overhead of around 0.23 ms/job. Compared to GNU
B<parallel>'s 2-3 ms this is fast. Compared to B<parallel-bash>'s 0.05
ms/job it is slow.

=head3 Memory use and run time for large output

Output takes O(n^2) time for output of size n. 10 MB takes ~1 second,
30 MB takes ~7 seconds, 100 MB takes ~60 seconds, 300 MB takes ~480
seconds, 1000 MB takes ~10000 seconds.

100 MB of output takes around 1 GB of RAM.

  # Run time = 15 sec
  # Memory use = 20 MB
  # Output = 1 GB per job
  \time -v parallel -j1 seq ::: 120000000 120000000 >/dev/null

  # Run time = 4.7 sec
  # Memory use = 95 MB
  # Output = 8 MB per job
  \time -v repeater -w 1 -n 2 -reportFile ./run_output seq 1200000 >/dev/null

  # Run time = 42 sec
  # Memory use = 277 MB
  # Output = 27 MB per job
  \time -v repeater -w 1 -n 2 -reportFile ./run_output seq 3600000 >/dev/null

  # Run time = 530 sec
  # Memory use = 1000 MB
  # Output = 97 MB per job
  \time -v repeater -w 1 -n 2 -reportFile ./run_output seq 12000000 >/dev/null

  # Run time = 2h41m
  # Memory use = 8.6 GB
  # Output = 1 GB per job
  \time -v repeater -w 1 -n 2 -reportFile ./run_output seq 120000000 >/dev/null

For even just moderate sized outputs GNU B<parallel> will be faster
and use less memory.


=head3 EXAMPLES

  1$ repeater -n 100 -w 10 -reportFile ./run_output \
       -output REPORT_FILE -progress BOTH curl example.com

  1$ seq 100 | parallel --joblog run.log --eta curl example.com > output

  2$ repeater -n 100 -increment -progress HIDDEN -reportFile foo \
       echo "this is increment: " INC
  2$ seq 100 | parallel echo {}
  2$ seq 100 | parallel echo '{= $_ = ++$myvar =}'

https://github.com/baalimago/repeater
(Last checked: 2023-12)


=head2 DIFFERENCES BETWEEN parallelize AND GNU Parallel

Summary (see legend above):

=over

=item I1 - - - - - I7

=item - - - - - M6

=item O1 - O3 O4 O5 - O7 - - -

=item E1 - - E4 - - -

=item - - - - - - - - -

=item - -

=back

B<parallelize> runs the full line as a command. If the command is not
found, there is no warning.

It outputs at most ~1000000 lines/s. If the lines are short this is
quite slow. The lines can at most be 2047999 bytes long. Longer lines
cause a segfault.


=head3 EXAMPLES

simple.dat:

  sleep 5
  cat alire.toml
  loc src/parallelize.adb
  sh loc src/*.ad?

  1$ bin/parallelize -v <simple.dat

  1$ parallel <simple.dat

https://github.com/simonjwright/parallelize
(Last checked: 2024-04)

=head2 Todo

https://github.com/justanhduc/task-spooler

https://manpages.ubuntu.com/manpages/xenial/man1/tsp.1.html

https://www.npmjs.com/package/concurrently

http://code.google.com/p/push/ (cannot compile)

https://github.com/krashanoff/parallel

https://github.com/Nukesor/pueue

https://arxiv.org/pdf/2012.15443.pdf KumQuat

https://github.com/JeiKeiLim/simple_distribute_job

https://github.com/reggi/pkgrun - not obvious how to use

https://github.com/benoror/better-npm-run - not obvious how to use

https://github.com/bahmutov/with-package

https://github.com/flesler/parallel

https://github.com/Julian/Verge

https://vicerveza.homeunix.net/~viric/soft/ts/

https://github.com/chapmanjacobd/que


=head1 TESTING OTHER TOOLS

There are certain issues that are very common in parallelizing
tools. Here are a few stress tests. Be warned: If the tool is badly
coded it may overload your machine.


=head2 MIX: Output mixes

Output from 2 jobs should not mix. If the output is not used, this
does not matter; but if the output I<is> used then it is important
that you do not get half a line from one job followed by half a line
from another job.

If the tool does not buffer, output will most likely mix now and then.

This test stresses whether output mixes.

  #!/bin/bash

  paralleltool="parallel -j 30"

  cat <<-EOF > mycommand
  #!/bin/bash

  # If a, b, c, d, e, and f mix: Very bad
  perl -e 'print STDOUT "a"x3000_000," "'
  perl -e 'print STDERR "b"x3000_000," "'
  perl -e 'print STDOUT "c"x3000_000," "'
  perl -e 'print STDERR "d"x3000_000," "'
  perl -e 'print STDOUT "e"x3000_000," "'
  perl -e 'print STDERR "f"x3000_000," "'
  echo
  echo >&2
  EOF
  chmod +x mycommand

  # Run 30 jobs in parallel
  seq 30 |
    $paralleltool ./mycommand > >(tr -s abcdef) 2> >(tr -s abcdef >&2)

  # 'a c e' and 'b d f' should always stay together
  # and there should only be a single line per job
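
The result of the MIX test can also be checked mechanically instead of
by eye. After B<tr -s abcdef> every stdout line should be exactly
'a c e ' and every stderr line exactly 'b d f ' (each with a trailing
space). The helper below is a sketch that counts lines that are
neither; its name is made up for the example.

```shell
# Hypothetical helper: read squeezed MIX output on stdin and print
# the number of lines that are neither 'a c e ' nor 'b d f '.
# 0 means no mixing was detected.
count_mixed_lines() {
  awk '$0 != "a c e " && $0 != "b d f " { bad++ }
       END { print bad + 0 }'
}

printf 'a c e \na c e \n' | count_mixed_lines      # prints 0
printf 'a ca c e \n e \n' | count_mixed_lines      # prints 2
```

To use it with the test above, pipe the squeezed stdout into the
helper, for example by appending B<| tr -s abcdef | count_mixed_lines>
to the command and sending stderr elsewhere.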


=head2 STDERRMERGE: Stderr is merged with stdout

Output from stdout and stderr should not be merged, but kept separated.

This test shows whether stdout is mixed with stderr.

  #!/bin/bash

  paralleltool="parallel -j0"

  cat <<-EOF > mycommand
  #!/bin/bash

  echo stdout
  echo stderr >&2
  echo stdout
  echo stderr >&2
  EOF
  chmod +x mycommand

  # Run one job
  echo |
    $paralleltool ./mycommand > stdout 2> stderr
  cat stdout
  cat stderr


=head2 RAM: Output limited by RAM

Some tools cache output in RAM. This makes them extremely slow if the
output is bigger than physical memory and makes them crash if the
output is bigger than the virtual memory.

  #!/bin/bash

  paralleltool="parallel -j0"

  cat <<'EOF' > mycommand
  #!/bin/bash

  # Generate 1 GB output
  yes "`perl -e 'print \"c\"x30_000'`" | head -c 1G
  EOF
  chmod +x mycommand

  # Run 20 jobs in parallel
  # Adjust 20 to be > physical RAM and < free space on /tmp
  seq 20 | time $paralleltool ./mycommand | wc -c


=head2 DISKFULL: Incomplete data if /tmp runs full

If caching is done on disk, the disk can run full during the run. Not
all programs discover this. GNU Parallel discovers it, if it stays
full for at least 2 seconds.

  #!/bin/bash

  paralleltool="parallel -j0"

  # This should be a dir with less than 100 GB free space
  smalldisk=/tmp/shm/parallel

  TMPDIR="$smalldisk"
  export TMPDIR

  max_output() {
      # Force worst case scenario:
      # Make GNU Parallel only check once per second
      sleep 10
      # Generate 100 GB to fill $TMPDIR
      # Adjust if /tmp is bigger than 100 GB
      yes | head -c 100G >$TMPDIR/$$
      # Generate 10 MB output that will not be buffered
      # due to full disk
      perl -e 'print "X"x10_000_000' | head -c 10M
      echo This part is missing from incomplete output
      sleep 2
      rm $TMPDIR/$$
      echo Final output
  }

  export -f max_output
  seq 10 | $paralleltool max_output | tr -s X


=head2 CLEANUP: Leaving tmp files at unexpected death

Some tools do not clean up their tmp files if they are killed. This is
most likely for tools that buffer output on disk.

  #!/bin/bash

  paralleltool=parallel

  ls /tmp >/tmp/before
  seq 10 | $paralleltool sleep &
  pid=$!
  # Give the tool time to start up
  sleep 1
  # Kill it without giving it a chance to cleanup
  kill -9 $pid
  # Should be empty: No files should be left behind
  diff <(ls /tmp) /tmp/before
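
One way for a tool to pass this test is to unlink each tmp file right
after opening it: the data stays reachable through the open file
descriptors, and the kernel frees the space when the process dies, no
matter how it dies. The sketch below shows the idea in shell. It is an
illustration under assumptions: the function name is made up, file
descriptors 3 and 4 are assumed to be unused, and it does not claim to
show how any particular tool is implemented.

```shell
# Hypothetical helper: buffer the stdout of a command in a tmp file
# that is already unlinked while the command runs, then print it.
buffer_unlinked() {
  bu_tmp=$(mktemp) || return 1
  # Open the file twice: fd 3 for writing, fd 4 for reading
  exec 3>"$bu_tmp" 4<"$bu_tmp"
  # Remove the name; the open descriptors keep the data alive
  rm -f "$bu_tmp"
  "$@" >&3
  exec 3>&-
  # Read the buffered output back from the start of the file
  cat <&4
  exec 4<&-
}

buffer_unlinked echo hello     # prints hello
```

Because the file has no name while the command runs, a B<kill -9> at
that point leaves nothing behind in the tmp dir.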


=head2 SPCCHAR: Dealing badly with special file names.

It is not uncommon for users to create files like:

  My brother's 12" *** record (costs $$$).jpg

Some tools break on this.

  #!/bin/bash

  paralleltool=parallel

  touch "My brother's 12\" *** record (costs \$\$\$).jpg"
  ls My*jpg | $paralleltool ls -l


=head2 COMPOSED: Composed commands do not work

Some tools require you to wrap composed commands into B<bash -c>.

  echo bar | $paralleltool echo foo';' echo {}


=head2 ONEREP: Only one replacement string allowed

Some tools can only insert the argument once.

  echo bar | $paralleltool echo {} foo {}


=head2 INPUTSIZE: Length of input should not be limited

Some tools limit the length of the input lines artificially with no good
reason. GNU B<parallel> does not:

  perl -e 'print "foo."."x"x100_000_000' | parallel echo {.}

GNU B<parallel> limits the command to run to 128 KB due to execve(2):

  perl -e 'print "x"x131_000' | parallel echo {} | wc
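
The limit on a single argument is set by the operating system and can
be probed directly. The sketch below passes one argument of a given
size to an external command and reports whether the exec succeeded.
The helper name is made up, and the result depends on the system: 128
KB per argument is what Linux enforces, other systems may differ.

```shell
# Hypothetical probe: can a single argument of N bytes be passed to
# an external command on this system?
arg_fits() {
  env true "$(head -c "$1" /dev/zero | tr '\0' x)" 2>/dev/null
}

if arg_fits 131000; then
  echo "131000 bytes fit in one argument"
fi
if ! arg_fits 200000; then
  echo "200000 bytes do not fit in one argument"
fi
# Total size allowed for arguments plus environment
getconf ARG_MAX
```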


=head2 NUMWORDS: Speed depends on number of words

Some tools become very slow if output lines have many words.

  #!/bin/bash

  paralleltool=parallel

  cat <<-EOF > mycommand
  #!/bin/bash

  # 10 MB of lines with 1000 words
  yes "`seq 1000`" | head -c 10M
  EOF
  chmod +x mycommand

  # Run 30 jobs in parallel
  seq 30 | time $paralleltool -j0 ./mycommand > /dev/null

=head2 4GB: Output with a line > 4GB should be OK

  #!/bin/bash

  paralleltool="parallel -j0"

  cat <<-EOF > mycommand
  #!/bin/bash

  perl -e '\$a="a"x1000_000; for(1..5000) { print \$a }'
  EOF
  chmod +x mycommand

  # Run 1 job
  seq 1 | $paralleltool ./mycommand | LC_ALL=C wc

=head1 AUTHOR

When using GNU B<parallel> for a publication please cite:

O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
The USENIX Magazine, February 2011:42-47.

This helps funding further development; and it won't cost you a cent.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.

Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk

Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk

Copyright (C) 2010-2024 Ole Tange, http://ole.tange.dk and Free
Software Foundation, Inc.

Parts of the manual concerning B<xargs> compatibility are inspired by
the manual of B<xargs> from GNU findutils 4.4.2.


=head1 LICENSE

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.


=head2 Documentation license I

Permission is granted to copy, distribute and/or modify this
documentation under the terms of the GNU Free Documentation License,
Version 1.3 or any later version published by the Free Software
Foundation; with no Invariant Sections, with no Front-Cover Texts, and
with no Back-Cover Texts. A copy of the license is included in the
file LICENSES/GFDL-1.3-or-later.txt.


=head2 Documentation license II

You are free:

=over 9

=item B<to Share>

to copy, distribute and transmit the work

=item B<to Remix>

to adapt the work

=back

Under the following conditions:

=over 9

=item B<Attribution>

You must attribute the work in the manner specified by the author or
licensor (but not in any way that suggests that they endorse you or
your use of the work).

=item B<Share Alike>

If you alter, transform, or build upon this work, you may distribute
the resulting work only under the same, similar or a compatible
license.

=back

With the understanding that:

=over 9

=item B<Waiver>

Any of the above conditions can be waived if you get permission from
the copyright holder.

=item B<Public Domain>

Where the work or any of its elements is in the public domain under
applicable law, that status is in no way affected by the license.

=item B<Other Rights>

In no way are any of the following rights affected by the license:

=over 2

=item *

Your fair dealing or fair use rights, or other applicable
copyright exceptions and limitations;

=item *

The author's moral rights;

=item *

Rights other persons may have either in the work itself or in
how the work is used, such as publicity or privacy rights.

=back

=back

=over 9

=item B<Notice>

For any reuse or distribution, you must make clear to others the
license terms of this work.

=back

A copy of the full license is included in the file
LICENSES/CC-BY-SA-4.0.txt.


=head1 DEPENDENCIES

GNU B<parallel> uses Perl, and the Perl modules Getopt::Long,
IPC::Open3, Symbol, IO::File, POSIX, and File::Temp. For remote usage
it also uses rsync with ssh.


=head1 SEE ALSO

B<find>(1), B<xargs>(1), B<make>(1), B<pexec>(1), B<ppss>(1),
B<xjobs>(1), B<prll>(1), B<dxargs>(1), B<mdm>(1)

=cut