#!/usr/bin/perl -w

# SPDX-FileCopyrightText: 2021-2024 Ole Tange, http://ole.tange.dk and Free Software Foundation, Inc.
# SPDX-License-Identifier: GFDL-1.3-or-later
# SPDX-License-Identifier: CC-BY-SA-4.0

=encoding utf8

=head1 GNU PARALLEL EXAMPLES

=head2 EXAMPLE: Working as xargs -n1. Argument appending

GNU B<parallel> can work similarly to B<xargs -n1>.

To compress all html files using B<gzip> run:

  find . -name '*.html' | parallel gzip --best

If the file names may contain a newline use B<-0>. Substitute FOO BAR
with FUBAR in all files in this dir and subdirs:

  find . -type f -print0 | \
    parallel -q0 perl -i -pe 's/FOO BAR/FUBAR/g'

Note B<-q> is needed because of the space in 'FOO BAR'.
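
One way to inspect what will actually be run is B<--dry-run>. Without
B<-q> the quoting is lost when the command is composed, so B<perl>
would see 's/FOO' and 'BAR/FUBAR/g' as two separate arguments:

  find . -type f -print0 | \
    parallel --dry-run -q0 perl -i -pe 's/FOO BAR/FUBAR/g'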

=head2 EXAMPLE: Simple network scanner

B<prips> can generate IP-addresses from CIDR notation. With GNU
B<parallel> you can build a simple network scanner to see which
addresses respond to B<ping>:

  prips 130.229.16.0/20 | \
    parallel --timeout 2 -j0 \
      'ping -c 1 {} >/dev/null && echo {}' 2>/dev/null

=head2 EXAMPLE: Reading arguments from command line

GNU B<parallel> can take the arguments from the command line instead
of stdin (standard input). To compress all html files in the current
dir using B<gzip> run:

  parallel gzip --best ::: *.html

To convert *.wav to *.mp3 using LAME running one process per CPU run:

  parallel lame {} -o {.}.mp3 ::: *.wav

=head2 EXAMPLE: Inserting multiple arguments

When moving a lot of files like this: B<mv *.log destdir> you will
sometimes get the error:

  bash: /bin/mv: Argument list too long

because there are too many files. You can instead do:

  ls | grep -E '\.log$' | parallel mv {} destdir

This will run B<mv> for each file. It can be done faster if B<mv> gets
as many arguments as will fit on the line:

  ls | grep -E '\.log$' | parallel -m mv {} destdir

In many shells you can also use B<printf>:

  printf '%s\0' *.log | parallel -0 -m mv {} destdir

=head2 EXAMPLE: Context replace

To remove the files I<pict0000.jpg> .. I<pict9999.jpg> you could do:

  seq -w 0 9999 | parallel rm pict{}.jpg

You could also do:

  seq -w 0 9999 | perl -pe 's/(.*)/pict$1.jpg/' | parallel -m rm

The first will run B<rm> 10000 times, while the last will only run
B<rm> as many times as needed to keep the command line length short
enough to avoid B<Argument list too long> (it typically runs 1-2 times).
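
To check how many B<rm> invocations that amounts to on your system,
you can count the command lines generated by B<--dry-run>:

  seq -w 0 9999 | perl -pe 's/(.*)/pict$1.jpg/' | \
    parallel -m --dry-run rm | wc -l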

You could also run:

  seq -w 0 9999 | parallel -X rm pict{}.jpg

This will also only run B<rm> as many times as needed to keep the
command line length short enough.

=head2 EXAMPLE: Compute intensive jobs and substitution

If ImageMagick is installed this will generate a thumbnail of a jpg
file:

  convert -geometry 120 foo.jpg thumb_foo.jpg

This will run with number-of-cpus jobs in parallel for all jpg files
in a directory:

  ls *.jpg | parallel convert -geometry 120 {} thumb_{}

To do it recursively use B<find>:

  find . -name '*.jpg' | \
    parallel convert -geometry 120 {} {}_thumb.jpg

Notice how the second argument has to start with B<{}> as B<{}> will
include the path (e.g. running B<convert -geometry 120 ./foo/bar.jpg
thumb_./foo/bar.jpg> would clearly be wrong). The command will
generate files like ./foo/bar.jpg_thumb.jpg.

Use B<{.}> to avoid the extra .jpg in the file name. This command will
make files like ./foo/bar_thumb.jpg:

  find . -name '*.jpg' | \
    parallel convert -geometry 120 {} {.}_thumb.jpg

=head2 EXAMPLE: Substitution and redirection

This will generate an uncompressed version of .gz-files next to the
.gz-file:

  parallel zcat {} ">"{.} ::: *.gz

Quoting of > is necessary to postpone the redirection. Another
solution is to quote the whole command:

  parallel "zcat {} >{.}" ::: *.gz

Other special shell characters (such as * ; $ > < | >> <<) also need
to be put in quotes, as they may otherwise be interpreted by the shell
and not given to GNU B<parallel>.
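
For example, to run a pipe inside each job (rather than piping the
combined output of GNU B<parallel>), the | must be quoted so it
becomes part of the job:

  parallel 'echo {} | wc -c' ::: a bb ccc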

=head2 EXAMPLE: Composed commands

A job can consist of several commands. This will print the number of
files in each directory:

  ls | parallel 'echo -n {}" "; ls {}|wc -l'

To put the output in a file called <name>.dir:

  ls | parallel '(echo -n {}" "; ls {}|wc -l) >{}.dir'

Even small shell scripts can be run by GNU B<parallel>:

  find . | parallel 'a={}; name=${a##*/};' \
    'upper=$(echo "$name" | tr "[:lower:]" "[:upper:]");'\
    'echo "$name - $upper"'

  ls | parallel 'mv {} "$(echo {} | tr "[:upper:]" "[:lower:]")"'

Given a list of URLs, list all URLs that fail to download. Print the
line number and the URL:

  cat urlfile | parallel "wget {} 2>/dev/null || grep -n {} urlfile"

Create a mirror directory with the same file names except all files
and symlinks are empty files:

  cp -rs /the/source/dir mirror_dir
  find mirror_dir -type l | parallel -m rm {} '&&' touch {}

Find the files in a list that do not exist:

  cat file_list | parallel 'if [ ! -e {} ] ; then echo {}; fi'

=head2 EXAMPLE: Composed command with perl replacement string

You have a bunch of files. You want them sorted into dirs. The dir of
each file should be named the first letter of the file name.

  parallel 'mkdir -p {=s/(.).*/$1/=}; mv {} {=s/(.).*/$1/=}' ::: *

=head2 EXAMPLE: Composed command with multiple input sources

You have a dir with files named as 24 hours in 5 minute intervals:
00:00, 00:05, 00:10 .. 23:55. You want to find the files missing:

  parallel [ -f {1}:{2} ] "||" echo {1}:{2} does not exist \
    ::: {00..23} ::: {00..55..5}

=head2 EXAMPLE: Calling Bash functions

If the composed command is longer than a line, it becomes hard to
read. In Bash you can use functions. Just remember to B<export -f> the
function.

  doit() {
    echo Doing it for $1
    sleep 2
    echo Done with $1
  }
  export -f doit
  parallel doit ::: 1 2 3

  doubleit() {
    echo Doing it for $1 $2
    sleep 2
    echo Done with $1 $2
  }
  export -f doubleit
  parallel doubleit ::: 1 2 3 ::: a b

To do this on remote servers you need to transfer the function using
B<--env>:

  parallel --env doit -S server doit ::: 1 2 3
  parallel --env doubleit -S server doubleit ::: 1 2 3 ::: a b

If your environment (aliases, variables, and functions) is small you
can copy the full environment without having to B<export -f>
anything. See B<env_parallel>.
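
A minimal sketch of that approach (assuming B<env_parallel.bash> is
installed next to GNU B<parallel>):

  # Source once per session to enable env_parallel
  . $(which env_parallel.bash)
  doit() { echo Doing it for $1; }
  # No export -f needed: env_parallel copies the environment itself
  env_parallel doit ::: 1 2 3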

=head2 EXAMPLE: Function tester

To test a program with different parameters:

  tester() {
    if (eval "$@") >&/dev/null; then
      perl -e 'printf "\033[30;102m[ OK ]\033[0m @ARGV\n"' "$@"
    else
      perl -e 'printf "\033[30;101m[FAIL]\033[0m @ARGV\n"' "$@"
    fi
  }
  export -f tester
  parallel tester my_program ::: arg1 arg2
  parallel tester exit ::: 1 0 2 0

If B<my_program> fails a red FAIL will be printed followed by the
failing command; otherwise a green OK will be printed followed by the
command.

=head2 EXAMPLE: Identify a few failing jobs

B<--bar> works best if jobs have no output. If the failing jobs have
output you can identify the jobs like this:

  job-with-few-failures() {
    # Force reproducibility
    RANDOM=$1
    # This fails 1% (328 of 32768)
    if [ $RANDOM -lt 328 ] ; then
      echo Failed $1
    fi
  }
  export -f job-with-few-failures
  seq 1000 | parallel --bar --tag job-with-few-failures

=head2 EXAMPLE: Continuously show the latest line of output

It can be useful to monitor the output of running jobs.

This shows the most recent output line until a job finishes, after
which the output of the job is printed in full:

  parallel '{} | tee >(cat >&3)' ::: 'command 1' 'command 2' \
    3> >(perl -ne '$|=1;chomp;printf"%.'$COLUMNS's\r",$_." "x100')

=head2 EXAMPLE: Log rotate

Log rotation renames a logfile to an extension with a higher number:
log.1 becomes log.2, log.2 becomes log.3, and so on. The oldest log
is removed. To avoid overwriting files the process starts backwards
from the high number to the low number. This will keep 10 old
versions of the log:

  seq 9 -1 1 | parallel -j1 mv log.{} log.'{= $_++ =}'
  mv log log.1

=head2 EXAMPLE: Removing file extension when processing files

When processing files removing the file extension using B<{.}> is
often useful.

Create a directory for each zip-file and unzip it in that dir:

  parallel 'mkdir {.}; cd {.}; unzip ../{}' ::: *.zip

Recompress all .gz files in current directory using B<bzip2> running
1 job per CPU in parallel:

  parallel "zcat {} | bzip2 >{.}.bz2 && rm {}" ::: *.gz

Convert all WAV files to MP3 using LAME:

  find sounddir -type f -name '*.wav' | parallel lame {} -o {.}.mp3

Put all converted files in the same directory:

  find sounddir -type f -name '*.wav' | \
    parallel lame {} -o mydir/{/.}.mp3

=head2 EXAMPLE: Replacing parts of file names

If you deal with paired-end reads, you will have files like
barcode1_R1.fq.gz, barcode1_R2.fq.gz, barcode2_R1.fq.gz, and
barcode2_R2.fq.gz.

You want barcodeI<N>_R1 to be processed with barcodeI<N>_R2.

  parallel --plus myprocess {} {/_R1.fq.gz/_R2.fq.gz} ::: *_R1.fq.gz

If the barcode does not contain '_R1', you can do:

  parallel --plus myprocess {} {/_R1/_R2} ::: *_R1.fq.gz

=head2 EXAMPLE: Removing strings from the argument

If you have a directory with tar.gz files and want these extracted in
the corresponding dir (e.g. foo.tar.gz will be extracted in the dir
foo) you can do:

  parallel --plus 'mkdir {..}; tar -C {..} -xf {}' ::: *.tar.gz

If you want to remove a different ending, you can use {%string}:

  parallel --plus echo {%_demo} ::: mycode_demo keep_demo_here

You can also remove a starting string with {#string}:

  parallel --plus echo {#demo_} ::: demo_mycode keep_demo_here

To remove a string anywhere you can use regular expressions with
{/regexp/replacement} and leave the replacement empty:

  parallel --plus echo {/demo_/} ::: demo_mycode remove_demo_here

=head2 EXAMPLE: Download 24 images for each of the past 30 days

Let us assume a website stores images like:

  https://www.example.com/path/to/YYYYMMDD_##.jpg

where YYYYMMDD is the date and ## is the number 01-24. This will
download images for the past 30 days:

  getit() {
    date=$(date -d "today -$1 days" +%Y%m%d)
    num=$2
    echo wget https://www.example.com/path/to/${date}_${num}.jpg
  }
  export -f getit

  parallel getit ::: $(seq 30) ::: $(seq -w 24)

B<$(date -d "today -$1 days" +%Y%m%d)> will give the dates in
YYYYMMDD with B<$1> days subtracted.

=head2 EXAMPLE: Download world map from NASA

NASA provides tiles to download on earthdata.nasa.gov. Download tiles
for Blue Marble world map and create a 10240x20480 map.

  base=https://map1a.vis.earthdata.nasa.gov/wmts-geo/wmts.cgi
  service="SERVICE=WMTS&REQUEST=GetTile&VERSION=1.0.0"
  layer="LAYER=BlueMarble_ShadedRelief_Bathymetry"
  set="STYLE=&TILEMATRIXSET=EPSG4326_500m&TILEMATRIX=5"
  tile="TILEROW={1}&TILECOL={2}"
  format="FORMAT=image%2Fjpeg"
  url="$base?$service&$layer&$set&$tile&$format"

  parallel -j0 -q wget "$url" -O {1}_{2}.jpg ::: {0..19} ::: {0..39}
  parallel eval convert +append {}_{0..39}.jpg line{}.jpg ::: {0..19}
  convert -append line{0..19}.jpg world.jpg

=head2 EXAMPLE: Download Apollo-11 images from NASA using jq

Search NASA using their API to get JSON for images related to 'apollo
11' that have 'moon landing' in the description.

The search query returns JSON containing URLs to JSON containing
collections of pictures. One of the pictures in each of these
collections is I<large>.

B<wget> is used to get the JSON for the search query. B<jq> is then
used to extract the URLs of the collections. B<parallel> then calls
B<wget> to get each collection, which is passed to B<jq> to extract
the URLs of all images. B<grep> selects the I<large> images, and
B<parallel> finally uses B<wget> to fetch the images.

  base="https://images-api.nasa.gov/search"
  q="q=apollo 11"
  description="description=moon landing"
  media_type="media_type=image"
  wget -O - "$base?$q&$description&$media_type" |
    jq -r .collection.items[].href |
    parallel wget -O - |
    jq -r .[] |
    grep large |
    parallel wget

=head2 EXAMPLE: Download video playlist in parallel

B<youtube-dl> is an excellent tool to download videos. It cannot,
however, download videos in parallel. This takes a playlist and
downloads 10 videos in parallel.

  url='youtu.be/watch?v=0wOf2Fgi3DE&list=UU_cznB5YZZmvAmeq7Y3EriQ'
  export url
  youtube-dl --flat-playlist "https://$url" |
    parallel --tagstring {#} --lb -j10 \
      youtube-dl --playlist-start {#} --playlist-end {#} '"https://$url"'

=head2 EXAMPLE: Prepend last modified date (ISO8601) to file name

  parallel mv {} '{= $a=pQ($_); $b=$_;' \
    '$_=qx{date -r "$a" +%FT%T}; chomp; $_="$_ $b" =}' ::: *

B<{=> and B<=}> mark a perl expression. B<pQ> perl-quotes the
string. B<date +%FT%T> is the date in ISO8601 with time.

=head2 EXAMPLE: Save output in ISO8601 dirs

Save output from B<ps aux> every second into dirs named
yyyy-mm-ddThh:mm:ss+zz:zz.

  seq 1000 | parallel -N0 -j1 --delay 1 \
    --results '{= $_=`date -Isec`; chomp=}/' ps aux

=head2 EXAMPLE: Digital clock with "blinking" :

The : in a digital clock blinks. To make every other line have a ':'
and the rest a ' ' a perl expression is used to look at the 3rd input
source. If the value modulo 2 is 1: Use ":" otherwise use " ":

  parallel -k echo {1}'{=3 $_=$_%2?":":" "=}'{2}{3} \
    ::: {0..12} ::: {0..5} ::: {0..9}

=head2 EXAMPLE: Aggregating content of files

This:

  parallel --header : echo x{X}y{Y}z{Z} \> x{X}y{Y}z{Z} \
    ::: X {1..5} ::: Y {01..10} ::: Z {1..5}

will generate the files x1y01z1 .. x5y10z5. If you want to aggregate
the output grouping on x and z you can do this:

  parallel eval 'cat {=s/y01/y*/=} > {=s/y01//=}' ::: *y01*

For all values of x and z it runs commands like:

  cat x1y*z1 > x1z1

So you end up with x1z1 .. x5z5 each containing the content of all
values of y.

=head2 EXAMPLE: Breadth first parallel web crawler/mirrorer

The script below will crawl and mirror a URL in parallel. It
downloads first pages that are 1 click down, then 2 clicks down, then
3; instead of the normal depth first, where the first link on each
page is fetched first.

Run like this:

  PARALLEL=-j100 ./parallel-crawl http://gatt.org.yeslab.org/

Remove the B<wget> part if you only want a web crawler.

It works by fetching a page from a list of URLs and looking for links
in that page that are within the same starting URL and that have not
already been seen. These links are added to a new queue. When all the
pages from the list are done, the new queue is moved to the list of
URLs and the process is started over until no unseen links are found.

  #!/bin/bash

  # E.g. http://gatt.org.yeslab.org/
  URL=$1
  # Stay inside the start dir
  BASEURL=$(echo $URL | perl -pe 's:#.*::; s:(//.*/)[^/]*:$1:')
  URLLIST=$(mktemp urllist.XXXX)
  URLLIST2=$(mktemp urllist.XXXX)
  SEEN=$(mktemp seen.XXXX)

  # Spider to get the URLs
  echo $URL >$URLLIST
  cp $URLLIST $SEEN

  while [ -s $URLLIST ] ; do
    cat $URLLIST |
      parallel lynx -listonly -image_links -dump {} \; \
        wget -qm -l1 -Q1 {} \; echo Spidered: {} \>\&2 |
      perl -ne 's/#.*//; s/\s+\d+.\s(\S+)$/$1/ and
        do { $seen{$1}++ or print }' |
      grep -F $BASEURL |
      grep -v -x -F -f $SEEN | tee -a $SEEN > $URLLIST2
    mv $URLLIST2 $URLLIST
  done

  rm -f $URLLIST $URLLIST2 $SEEN

=head2 EXAMPLE: Process files from a tar file while unpacking

If the files to be processed are in a tar file then unpacking one
file and processing it immediately may be faster than first unpacking
all files.

  tar xvf foo.tgz | perl -ne 'print $l;$l=$_;END{print $l}' | \
    parallel echo

The Perl one-liner is needed to make sure the file is complete before
handing it to GNU B<parallel>.
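
The one-liner simply delays each file name by one line. A commented
version of the same idea:

  tar xvf foo.tgz | perl -ne '
      print $l;        # print the previous name: tar has moved on,
                       # so that file is fully unpacked
      $l = $_;         # remember the current name
      END { print $l } # print the last name when tar is done
  ' | parallel echo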

=head2 EXAMPLE: Rewriting a for-loop and a while-read-loop

for-loops like this:

  (for x in `cat list` ; do
    do_something $x
  done) | process_output

and while-read-loops like this:

  cat list | (while read x ; do
    do_something $x
  done) | process_output

can be written like this:

  cat list | parallel do_something | process_output

For example: Find which host name in a list has IP address 1.2.3.4:

  cat hosts.txt | parallel -P 100 host | grep 1.2.3.4

If the processing requires more steps the for-loop like this:

  (for x in `cat list` ; do
    no_extension=${x%.*};
    do_step1 $x scale $no_extension.jpg
    do_step2 <$x $no_extension
  done) | process_output

and while-loops like this:

  cat list | (while read x ; do
    no_extension=${x%.*};
    do_step1 $x scale $no_extension.jpg
    do_step2 <$x $no_extension
  done) | process_output

can be written like this:

  cat list | parallel "do_step1 {} scale {.}.jpg ; do_step2 <{} {.}" |\
    process_output

If the body of the loop is bigger, it improves readability to use a
function:

  (for x in `cat list` ; do
    do_something $x
    [... 100 lines that do something with $x ...]
  done) | process_output

  cat list | (while read x ; do
    do_something $x
    [... 100 lines that do something with $x ...]
  done) | process_output

can both be rewritten as:

  doit() {
    x=$1
    do_something $x
    [... 100 lines that do something with $x ...]
  }
  export -f doit
  cat list | parallel doit

=head2 EXAMPLE: Rewriting nested for-loops

Nested for-loops like this:

  (for x in `cat xlist` ; do
    for y in `cat ylist` ; do
      do_something $x $y
    done
  done) | process_output

can be written like this:

  parallel do_something {1} {2} :::: xlist ylist | process_output

Nested for-loops like this:

  (for colour in red green blue ; do
    for size in S M L XL XXL ; do
      echo $colour $size
    done
  done) | sort

can be written like this:

  parallel echo {1} {2} ::: red green blue ::: S M L XL XXL | sort

=head2 EXAMPLE: Finding the lowest difference between files

B<diff> is good for finding differences in text files. B<diff | wc -l>
gives an indication of the size of the difference. To find the
differences between all files in the current dir do:

  parallel --tag 'diff {1} {2} | wc -l' ::: * ::: * | sort -nk3

This way it is possible to see if some files are closer to other
files.
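
Every file is also compared to itself, which always gives a
difference of 0. A small sketch that filters away these
self-comparisons (assuming file names without whitespace):

  parallel --tag 'diff {1} {2} | wc -l' ::: * ::: * | \
    awk '$1 != $2' | sort -nk3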

=head2 EXAMPLE: for-loops with column names

When doing multiple nested for-loops it can be easier to keep track
of the loop variable if it is named instead of just having a number.
Use B<--header :> to let the first argument be a named alias for the
positional replacement string:

  parallel --header : echo {colour} {size} \
    ::: colour red green blue ::: size S M L XL XXL

This also works if the input file is a file with columns:

  cat addressbook.tsv | \
    parallel --colsep '\t' --header : echo {Name} {E-mail address}

=head2 EXAMPLE: All combinations in a list

GNU B<parallel> makes all combinations when given two lists.

To make all combinations in a single list with unique values, you
repeat the list and use the replacement string B<{choose_k}>:

  parallel --plus echo {choose_k} ::: A B C D ::: A B C D

  parallel --plus echo 2{2choose_k} 1{1choose_k} ::: A B C D ::: A B C D

B<{choose_k}> works for any number of input sources:

  parallel --plus echo {choose_k} ::: A B C D ::: A B C D ::: A B C D

Where B<{choose_k}> does not care about order, B<{uniq}> cares about
order. It simply skips jobs where values from different input sources
are the same:

  parallel --plus echo {uniq} ::: A B C ::: A B C ::: A B C
  parallel --plus echo {1uniq}+{2uniq}+{3uniq} \
    ::: A B C ::: A B C ::: A B C

The behaviour of B<{choose_k}> is undefined if the input values of
each source are different.

=head2 EXAMPLE: From a to b and b to c

Assume you have input like:

  aardvark
  babble
  cab
  dab
  each

and want to run combinations like:

  aardvark babble
  babble cab
  cab dab
  dab each

If the input is in the file in.txt:

  parallel echo {1} - {2} ::::+ <(head -n -1 in.txt) <(tail -n +2 in.txt)

If the input is in the array $a here are two solutions:

  seq $((${#a[@]}-1)) | \
    env_parallel --env a echo '${a[{=$_--=}]} - ${a[{}]}'
  parallel echo {1} - {2} ::: "${a[@]::${#a[@]}-1}" :::+ "${a[@]:1}"

=head2 EXAMPLE: Count the differences between all files in a dir

Using B<--results> the results are saved in /tmp/diffcount*.

  parallel --results /tmp/diffcount "diff -U 0 {1} {2} | \
    tail -n +3 |grep -v '^@'|wc -l" ::: * ::: *

To see the difference between file A and file B look at the file
'/tmp/diffcount/1/A/2/B'.

=head2 EXAMPLE: Speeding up fast jobs

Starting a job on the local machine takes around 3-10 ms. This can be
a big overhead if the job takes very few ms to run. Often you can
group small jobs together using B<-X> which will make the overhead
less significant. Compare the speed of these:

  seq -w 0 9999 | parallel touch pict{}.jpg
  seq -w 0 9999 | parallel -X touch pict{}.jpg

If your program cannot take multiple arguments, then you can use GNU
B<parallel> to spawn multiple GNU B<parallel>s:

  seq -w 0 9999999 | \
    parallel -j10 -q -I,, --pipe parallel -j0 touch pict{}.jpg

If B<-j0> normally spawns 252 jobs, then the above will try to spawn
2520 jobs. On a normal GNU/Linux system you can spawn 32000 jobs using
this technique with no problems. To raise the 32000 jobs limit raise
/proc/sys/kernel/pid_max to 4194303.
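
Raising the limit could look like this (requires root):

  echo 4194303 | sudo tee /proc/sys/kernel/pid_max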

If you do not need GNU B<parallel> to have control over each job (so
no need for B<--retries> or B<--joblog> or similar), then it can be
even faster if you can generate the command lines and pipe those to a
shell. So if you can do this:

  mygenerator | sh

Then that can be parallelized like this:

  mygenerator | parallel --pipe --block 10M sh

E.g.

  mygenerator() {
    seq 10000000 | perl -pe 'print "echo This is fast job number "';
  }
  mygenerator | parallel --pipe --block 10M sh

The overhead is 100000 times smaller, namely around 100 nanoseconds
per job.

=head2 EXAMPLE: Using shell variables

When using shell variables you need to quote them correctly as they
may otherwise be interpreted by the shell.

Notice the difference between:

  ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
  parallel echo ::: ${ARR[@]} # This is probably not what you want

and:

  ARR=("My brother's 12\" records are worth <\$\$\$>"'!' Foo Bar)
  parallel echo ::: "${ARR[@]}"

When using variables in the actual command that contain special
characters (e.g. space) you can quote them using B<'"$VAR"'> or using
"'s and B<-q>:

  VAR="My brother's 12\" records are worth <\$\$\$>"
  parallel -q echo "$VAR" ::: '!'
  export VAR
  parallel echo '"$VAR"' ::: '!'

If B<$VAR> does not contain ' then B<"'$VAR'"> will also work
(and does not need B<export>):

  VAR="My 12\" records are worth <\$\$\$>"
  parallel echo "'$VAR'" ::: '!'

If you use them in a function you just quote as you normally would do:

  VAR="My brother's 12\" records are worth <\$\$\$>"
  export VAR
  myfunc() { echo "$VAR" "$1"; }
  export -f myfunc
  parallel myfunc ::: '!'

=head2 EXAMPLE: Group output lines

When running jobs that output data, you often do not want the output
of multiple jobs to run together. GNU B<parallel> defaults to grouping
the output of each job, so the output is printed when the job
finishes. If you want full lines to be printed while the job is
running you can use B<--line-buffer>. If you want output to be
printed as soon as possible you can use B<-u>.

Compare the output of:

  parallel wget --progress=dot --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}
  parallel --line-buffer wget --progress=dot --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}
  parallel --latest-line wget --progress=dot --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}
  parallel -u wget --progress=dot --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}

=head2 EXAMPLE: Tag output lines

GNU B<parallel> groups the output lines, but it can be hard to see
where the different jobs begin. B<--tag> prepends the argument to make
that more visible:

  parallel --tag wget --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}

B<--tag> works with B<--line-buffer> but not with B<-u>:

  parallel --tag --line-buffer wget --limit-rate=100k \
    https://ftpmirror.gnu.org/parallel/parallel-20{}0822.tar.bz2 \
    ::: {12..16}

Check the uptime of the servers in I<~/.parallel/sshloginfile>:

  parallel --tag -S .. --nonall uptime

=head2 EXAMPLE: Colorize output

Give each job a new color. Most terminals support ANSI colors with
the escape code "\033[30;3Xm" where 0 <= X <= 7:

  seq 10 | \
    parallel --tagstring '\033[30;3{=$_=++$::color%8=}m' seq {}
  parallel --rpl '{color} $_="\033[30;3".(++$::color%8)."m"' \
    --tagstring {color} seq {} ::: {1..10}

To get rid of the initial \t (which comes from B<--tagstring>):

  ... | perl -pe 's/\t//'

=head2 EXAMPLE: Keep order of output same as order of input

Normally the output of a job will be printed as soon as it
completes. Sometimes you want the order of the output to remain the
same as the order of the input. This is often important, if the output
is used as input for another system. B<-k> will make sure the order of
output will be in the same order as input even if later jobs end
before earlier jobs.

Append a string to every line in a text file:

  cat textfile | parallel -k echo {} append_string

If you remove B<-k> some of the lines may come out in the wrong order.

Another example is B<traceroute>:

  parallel traceroute ::: qubes-os.org debian.org freenetproject.org

will give traceroute of qubes-os.org, debian.org and
freenetproject.org, but it will be sorted according to which job
completed first.

To keep the order the same as input run:

  parallel -k traceroute ::: qubes-os.org debian.org freenetproject.org

This will make sure the traceroute to qubes-os.org will be printed
first.

A bit more complex example is downloading a huge file in chunks in
parallel: Some internet connections will deliver more data if you
download files in parallel. For downloading files in parallel see
"EXAMPLE: Download 24 images for each of the past 30 days". But if you
are downloading a big file you can download the file in chunks in
parallel.

To download byte 10000000-19999999 you can use B<curl>:

  curl -r 10000000-19999999 https://example.com/the/big/file >file.part

To download a 1 GB file we need 100 10MB chunks downloaded and
combined in the correct order:

  seq 0 99 | parallel -k curl -r \
    {}0000000-{}9999999 https://example.com/the/big/file > file
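
The 100 chunks above are tied to the 1 GB size. A sketch that derives
the chunk count from the size reported by the server (assuming the
server sends I<Content-Length> and honours range requests):

  url=https://example.com/the/big/file
  # Ask for the size; keep the last header in case of redirects
  size=$(curl -sIL "$url" |
         awk 'tolower($1)=="content-length:" {print $2+0}' | tail -n1)
  seq 0 $(( (size - 1) / 10000000 )) | parallel -k curl -r \
    {}0000000-{}9999999 "$url" > file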

=head2 EXAMPLE: Parallel grep

B<grep -r> greps recursively through directories. GNU B<parallel> can
often speed this up.

  find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}

This will run 1.5 jobs per CPU, and give 1000 arguments to B<grep>.

There are situations where the above will be slower than B<grep -r>:

=over 2

=item *

If data is already in RAM. The overhead of starting jobs and
buffering output may outweigh the benefit of running in parallel.

=item *

If the files are big. If a file cannot be read in a single seek, the
disk may start thrashing.

=back

The speedup is caused by two factors:

=over 2

=item *

On rotating harddisks small files often require a seek for each
file. By searching for more files in parallel, the arm may pass
another wanted file on its way.

=item *

NVMe drives often perform better by having multiple commands running
in parallel.

=back

=head2 EXAMPLE: Grepping n lines for m regular expressions

The simplest solution to grep a big file for a lot of regexps is:

  grep -f regexps.txt bigfile

Or if the regexps are fixed strings:

  grep -F -f regexps.txt bigfile

There are 3 limiting factors: CPU, RAM, and disk I/O.

RAM is easy to measure: If the B<grep> process takes up most of your
free memory (e.g. when running B<top>), then RAM is a limiting factor.

CPU is also easy to measure: If the B<grep> takes >90% CPU in B<top>,
then the CPU is a limiting factor, and parallelization will speed this
up.

It is harder to see if disk I/O is the limiting factor, and depending
on the disk system it may be faster or slower to parallelize. The only
way to know for certain is to test and measure.

=head3 Limiting factor: RAM

The normal B<grep -f regexps.txt bigfile> works no matter the size of
bigfile, but if regexps.txt is so big it cannot fit into memory, then
you need to split this.

B<grep -F> takes around 100 bytes of RAM and B<grep> takes about 500
bytes of RAM per 1 byte of regexp. So if regexps.txt is 1% of your
RAM, then it may be too big.
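
E.g. with 8 GB of RAM a regexps.txt of 80 MB (1% of RAM) needs
roughly 80 MB * 100 = 8 GB with B<grep -F> and 80 MB * 500 = 40 GB
with normal B<grep>.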

If you can convert your regexps into fixed strings do that. E.g. if
the lines you are looking for in bigfile all look like:

  ID1 foo bar baz Identifier1 quux
  fubar ID2 foo bar baz Identifier2

then your regexps.txt can be converted from:

  ID1.*Identifier1
  ID2.*Identifier2

into:

  ID1 foo bar baz Identifier1
  ID2 foo bar baz Identifier2

This way you can use B<grep -F> which takes around 80% less memory
and is much faster.

If it still does not fit in memory you can do this:

  parallel --pipe-part -a regexps.txt --block 1M grep -F -f - -n bigfile | \
    sort -un | perl -pe 's/^\d+://'

The 1M should be your free memory divided by the number of CPU
threads and divided by 200 for B<grep -F> and by 1000 for normal
B<grep>. On GNU/Linux you can do:

  free=$(awk '/^((Swap)?Cached|MemFree|Buffers):/ { sum += $2 }
              END { print sum }' /proc/meminfo)
  percpu=$((free / 200 / $(parallel --number-of-threads)))k

  parallel --pipe-part -a regexps.txt --block $percpu --compress \
    grep -F -f - -n bigfile | \
    sort -un | perl -pe 's/^\d+://'

If you can live with duplicated lines and wrong order, it is faster
to do:

  parallel --pipe-part -a regexps.txt --block $percpu --compress \
    grep -F -f - bigfile

=head3 Limiting factor: CPU

If the CPU is the limiting factor parallelization should be done on
the regexps:

  cat regexps.txt | parallel --pipe -L1000 --round-robin --compress \
    grep -f - -n bigfile | \
    sort -un | perl -pe 's/^\d+://'

The command will start one B<grep> per CPU and read I<bigfile> one
time per CPU, but as that is done in parallel, all reads except the
first will be cached in RAM. Depending on the size of I<regexps.txt>
it may be faster to use B<--block 10m> instead of B<-L1000>.

Some storage systems perform better when reading multiple chunks in
parallel. This is true for some RAID systems and for some network
file systems. To parallelize the reading of I<bigfile>:

  parallel --pipe-part --block 100M -a bigfile -k --compress \
    grep -f regexps.txt

This will split I<bigfile> into 100MB chunks and run B<grep> on each
of these chunks. To parallelize both reading of I<bigfile> and
I<regexps.txt> combine the two using B<--cat>:

  parallel --pipe-part --block 100M -a bigfile --cat cat regexps.txt \
    \| parallel --pipe -L1000 --round-robin grep -f - {}

If a line matches multiple regexps, the line may be duplicated.

=head3 Bigger problem

If the problem is too big to be solved by this, you are probably
ready for Lucene.

=head2 EXAMPLE: Using remote computers

To run commands on a remote computer SSH needs to be set up and you
must be able to login without entering a password (the commands
B<ssh-copy-id>, B<ssh-agent>, and B<sshpass> may help you do that).

If you need to login to a whole cluster, you typically do not want to
accept the host key for every host. You want to accept them the first
time and be warned if they are ever changed. To do that:

  # Add the servers to the sshloginfile
  (echo servera; echo serverb) > .parallel/my_cluster
  # Make sure .ssh/config exist
  touch .ssh/config
  cp .ssh/config .ssh/config.backup
  # Disable StrictHostKeyChecking temporarily
  (echo 'Host *'; echo StrictHostKeyChecking no) >> .ssh/config
  parallel --slf my_cluster --nonall true
  # Remove the disabling of StrictHostKeyChecking
  mv .ssh/config.backup .ssh/config

The servers in B<.parallel/my_cluster> are now added in
B<.ssh/known_hosts>.

To run B<echo> on B<server.example.com>:

  seq 10 | parallel --sshlogin server.example.com echo

To run commands on more than one remote computer run:

  seq 10 | parallel --sshlogin s1.example.com,s2.example.net echo

Or:

  seq 10 | parallel --sshlogin server.example.com \
    --sshlogin server2.example.net echo

If the login username is I<foo> on I<server2.example.net> use:

  seq 10 | parallel --sshlogin server.example.com \
    --sshlogin foo@server2.example.net echo

If your list of hosts is I<server1-88.example.net> with login I<foo>:

  seq 10 | parallel -Sfoo@server{1..88}.example.net echo

To distribute the commands to a list of computers, make a file
I<mycomputers> with all the computers:

  server.example.com
  foo@server2.example.com
  server3.example.com

Then run:

  seq 10 | parallel --sshloginfile mycomputers echo

To include the local computer add the special sshlogin ':' to the
list:

  server.example.com
  foo@server2.example.com
  server3.example.com
  :

GNU B<parallel> will try to determine the number of CPUs on each of
the remote computers, and run one job per CPU - even if the remote
computers do not have the same number of CPUs.

If the number of CPUs on the remote computers is not identified
correctly the number of CPUs can be added in front. Here the computer
has 8 CPUs.

  seq 10 | parallel --sshlogin 8/server.example.com echo

=head2 EXAMPLE: Transferring of files

To recompress gzipped files with B<bzip2> using a remote computer run:

  find logs/ -name '*.gz' | \
    parallel --sshlogin server.example.com \
      --transfer "zcat {} | bzip2 -9 >{.}.bz2"

This will list the .gz-files in the I<logs> directory and all
directories below. Then it will transfer the files to
I<server.example.com> to the corresponding directory in
I<$HOME/logs>. On I<server.example.com> the file will be recompressed
using B<zcat> and B<bzip2> resulting in the corresponding file with
I<.gz> replaced with I<.bz2>.

If you want the resulting bz2-file to be transferred back to the
local computer add I<--return {.}.bz2>:

  find logs/ -name '*.gz' | \
    parallel --sshlogin server.example.com \
      --transfer --return {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"

After the recompressing is done the I<.bz2>-file is transferred back
to the local computer and put next to the original I<.gz>-file.

If you want to delete the transferred files on the remote computer
add I<--cleanup>. This will remove both the file transferred to the
remote computer and the files transferred from the remote computer:

  find logs/ -name '*.gz' | \
    parallel --sshlogin server.example.com \
      --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"

If you want to run on several computers add the computers to
I<--sshlogin> either using ',' or multiple I<--sshlogin>:

  find logs/ -name '*.gz' | \
    parallel --sshlogin server.example.com,server2.example.com \
      --sshlogin server3.example.com \
      --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"

You can add the local computer using I<--sshlogin :>. This will
disable the removing and transferring for the local computer only:

  find logs/ -name '*.gz' | \
    parallel --sshlogin server.example.com,server2.example.com \
      --sshlogin server3.example.com \
      --sshlogin : \
      --transfer --return {.}.bz2 --cleanup "zcat {} | bzip2 -9 >{.}.bz2"

Often I<--transfer>, I<--return> and I<--cleanup> are used together.
They can be shortened to I<--trc>:

  find logs/ -name '*.gz' | \
    parallel --sshlogin server.example.com,server2.example.com \
      --sshlogin server3.example.com \
      --sshlogin : \
      --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"

With the file I<mycomputers> containing the list of computers it
becomes:

  find logs/ -name '*.gz' | parallel --sshloginfile mycomputers \
    --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"

If the file I<~/.parallel/sshloginfile> contains the list of
computers the special short hand I<-S ..> can be used:

  find logs/ -name '*.gz' | parallel -S .. \
    --trc {.}.bz2 "zcat {} | bzip2 -9 >{.}.bz2"

=head2 EXAMPLE: Advanced file transfer

Assume you have files in in/*, want them processed on the server,
and transferred back into /other/dir:

  parallel -S server --trc /other/dir/./{/}.out \
    cp {/} {/}.out ::: in/./*

=head2 EXAMPLE: Distributing work to local and remote computers

Convert *.mp3 to *.ogg running one process per CPU on the local
computer and server2:

  parallel --trc {.}.ogg -S server2,: \
    'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg' ::: *.mp3

=head2 EXAMPLE: Running the same command on remote computers

To run the command B<uptime> on remote computers you can do:

  parallel --tag --nonall -S server1,server2 uptime

B<--nonall> reads no arguments. If you have a list of jobs you want
to run on each computer you can do:

  parallel --tag --onall -S server1,server2 echo ::: 1 2 3

Remove B<--tag> if you do not want the sshlogin added before the
output.

If you have a lot of hosts use '-j0' to access more hosts in parallel.

=head2 EXAMPLE: Running 'sudo' on remote computers

Put the password into passwordfile then run:

  parallel --ssh 'cat passwordfile | ssh' --nonall \
    -S user@server1,user@server2 sudo -S ls -l /root

=head2 EXAMPLE: Using remote computers behind NAT wall

If the workers are behind a NAT wall, you need some trickery to get
to them.

If you can B<ssh> to a jumphost, and reach the workers from there,
then the obvious solution would be this, but it B<does not work>:

  parallel --ssh 'ssh jumphost ssh' -S host1 echo ::: DOES NOT WORK

It does not work because the command is dequoted by B<ssh> twice
whereas GNU B<parallel> only expects it to be dequoted once.

You can use a bash function and have GNU B<parallel> quote the
command:

  jumpssh() { ssh -A jumphost ssh $(parallel --shellquote ::: "$@"); }
  export -f jumpssh
  parallel --ssh jumpssh -S host1 echo ::: this works

Or you can instead put this in B<~/.ssh/config>:

  Host host1 host2 host3
    ProxyCommand ssh jumphost.domain nc -w 1 %h 22

It requires B<nc> (netcat) to be installed on the jumphost. With this
you can simply:

  parallel -S host1,host2,host3 echo ::: This does work

=head3 No jumphost, but port forwards

If there is no jumphost but each server has port 22 forwarded from
the firewall (e.g. the firewall's port 22001 = port 22 on host1,
22002 = host2, 22003 = host3) then you can use B<~/.ssh/config>:

  Host host1.v
    Port 22001
  Host host2.v
    Port 22002
  Host host3.v
    Port 22003
  Host *.v
    Hostname firewall

And then use host{1..3}.v as normal hosts:

  parallel -S host1.v,host2.v,host3.v echo ::: a b c

=head3 No jumphost, no port forwards

If ports cannot be forwarded, you need some sort of VPN to traverse
the NAT-wall. TOR is one option for that, as it is very easy to get
working.

You need to install TOR and set up a hidden service. In B<torrc> put:

  HiddenServiceDir /var/lib/tor/hidden_service/
  HiddenServicePort 22 127.0.0.1:22

Then start TOR: B</etc/init.d/tor restart>

The TOR hostname is now in B</var/lib/tor/hidden_service/hostname>
and is something similar to B<izjafdceobowklhz.onion>. Now you simply
prepend B<torsocks> to B<ssh>:

  parallel --ssh 'torsocks ssh' -S izjafdceobowklhz.onion \
    -S zfcdaeiojoklbwhz.onion,auclucjzobowklhi.onion echo ::: a b c

If not all hosts are accessible through TOR:

  parallel -S 'torsocks ssh izjafdceobowklhz.onion,host2,host3' \
    echo ::: a b c

See more B<ssh> tricks on
https://en.wikibooks.org/wiki/OpenSSH/Cookbook/Proxies_and_Jump_Hosts

=head2 EXAMPLE: Use sshpass with ssh

If you cannot use passwordless login, you may be able to use
B<sshpass>:

  seq 10 | parallel -S user-with-password:MyPassword@server echo

Or:

  export SSHPASS='MyPa$$w0rd'
  seq 10 | parallel -S user-with-password:@server echo

=head2 EXAMPLE: Use outrun instead of ssh

B<outrun> lets you run a command on a remote server. B<outrun> sets
up a connection to access files at the source server, and
automatically transfers files. B<outrun> must be installed on the
remote system.

You can use B<outrun> in an sshlogin this way:

  parallel -S 'outrun user@server' command

Or:

  parallel --ssh outrun -S server command

=head2 EXAMPLE: Slurm cluster

The Slurm Workload Manager is used in many clusters.

Here is a simple example of using GNU B<parallel> to call B<srun>:

  #!/bin/bash

  #SBATCH --time 00:02:00
  #SBATCH --ntasks=4
  #SBATCH --job-name GnuParallelDemo
  #SBATCH --output gnuparallel.out

  module purge
  module load gnu_parallel

  my_parallel="parallel --delay .2 -j $SLURM_NTASKS"
  my_srun="srun --export=all --exclusive -n1"
  my_srun="$my_srun --cpus-per-task=1 --cpu-bind=cores"
  $my_parallel "$my_srun" echo This is job {} ::: {1..20}

=head2 EXAMPLE: Parallelizing rsync

B<rsync> is a great tool, but sometimes it will not fill up the
available bandwidth. Running multiple B<rsync> in parallel can fix
this.

  cd src-dir
  find . -type f |
    parallel -j10 -X rsync -zR -Ha ./{} fooserver:/dest-dir/

Adjust B<-j10> until you find the optimal number.

B<rsync -R> will create the needed subdirectories, so all files are
not put into a single dir. The B<./> is needed so the resulting
command looks similar to:

  rsync -zR ././sub/dir/file fooserver:/dest-dir/

The B</./> is what B<rsync -R> works on.

If you are unable to push data, but need to pull them and the files
are called digits.png (e.g. 000000.png) you might be able to do:

  seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/

=head2 EXAMPLE: Use multiple inputs in one command

Copy files like foo.es.ext to foo.ext:

  ls *.es.* | perl -pe 'print; s/\.es//' | parallel -N2 cp {1} {2}

The perl command spits out 2 lines for each input. GNU B<parallel>
takes 2 inputs (using B<-N2>) and replaces {1} and {2} with the
inputs.

Count in binary:

  parallel -k echo ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1 ::: 0 1

Print the number on the opposing sides of a six sided die:

  parallel --link -a <(seq 6) -a <(seq 6 -1 1) echo
  parallel --link echo :::: <(seq 6) <(seq 6 -1 1)

Convert files from all subdirs to PNG-files with consecutive numbers
(useful for making input PNG's for B<ffmpeg>):

  parallel --link -a <(find . -type f | sort) \
    -a <(seq $(find . -type f|wc -l)) convert {1} {2}.png

Alternative version:

  find . -type f | sort | parallel convert {} {#}.png

=head2 EXAMPLE: Use a table as input

Content of table_file.tsv:

  foo<TAB>bar
  baz <TAB> quux

To run:

  cmd -o bar -i foo
  cmd -o quux -i baz

you can run:

  parallel -a table_file.tsv --colsep '\t' cmd -o {2} -i {1}

Note: The default for GNU B<parallel> is to remove the spaces around
the columns. To keep the spaces:

  parallel -a table_file.tsv --trim n --colsep '\t' cmd -o {2} -i {1}

=head2 EXAMPLE: Output to database

GNU B<parallel> can output to a database table and a CSV-file:

  dburl=csv:///%2Ftmp%2Fmydir
  dbtableurl=$dburl/mytable.csv
  parallel --sqlandworker $dbtableurl seq ::: {1..10}

It is rather slow and takes up a lot of CPU time because GNU
B<parallel> parses the whole CSV file for each update.

A better approach is to use an SQLite database and then convert that
to CSV:

  dburl=sqlite3:///%2Ftmp%2Fmy.sqlite
  dbtableurl=$dburl/mytable
  parallel --sqlandworker $dbtableurl seq ::: {1..10}
  sql $dburl '.headers on' '.mode csv' 'SELECT * FROM mytable;'

This takes around a second per job.

If you have access to a real database system, such as PostgreSQL, it
is even faster:

  dburl=pg://user:pass@host/mydb
  dbtableurl=$dburl/mytable
  parallel --sqlandworker $dbtableurl seq ::: {1..10}
  sql $dburl \
    "COPY (SELECT * FROM mytable) TO stdout DELIMITER ',' CSV HEADER;"

Or MySQL:

  dburl=mysql://user:pass@host/mydb
  dbtableurl=$dburl/mytable
  parallel --sqlandworker $dbtableurl seq ::: {1..10}
  sql -p -B $dburl "SELECT * FROM mytable;" > mytable.tsv
  perl -pe 's/"/""/g; s/\t/","/g; s/^/"/; s/$/"/;
            %s=("\\" => "\\", "t" => "\t", "n" => "\n");
            s/\\([\\tn])/$s{$1}/g;' mytable.tsv

=head2 EXAMPLE: Output to CSV-file for R

If you have no need for the advanced job distribution control that a
database provides, but you simply want output into a CSV file that
you can read into R or LibreCalc, then you can use B<--results>:

  parallel --results my.csv seq ::: 10 20 30

  > mydf <- read.csv("my.csv");
  > print(mydf[2,])
  > write(as.character(mydf[2,c("Stdout")]),'')

=head2 EXAMPLE: Use XML as input

The show Aflyttet on Radio 24syv publishes an RSS feed with their
audio podcasts on:
http://arkiv.radio24syv.dk/audiopodcast/channel/4466232

Using B<xpath> you can extract the URLs for 2019 and download them
using GNU B<parallel>:

  wget -O - http://arkiv.radio24syv.dk/audiopodcast/channel/4466232 | \
    xpath -e "//pubDate[contains(text(),'2019')]/../enclosure/@url" | \
    parallel -u wget '{= s/ url="//; s/"//; =}'

=head2 EXAMPLE: Run the same command 10 times

If you want to run the same command with the same arguments 10 times
in parallel you can do:

  seq 10 | parallel -n0 my_command my_args

=head2 EXAMPLE: Working as cat | sh. Resource inexpensive jobs and evaluation

GNU B<parallel> can work similarly to B<cat | sh>.

A resource inexpensive job is a job that takes very little CPU, disk
I/O and network I/O. Ping is an example of a resource inexpensive
job. wget is too - if the webpages are small.

The content of the file jobs_to_run:

  ping -c 1 10.0.0.1
  wget http://example.com/status.cgi?ip=10.0.0.1
  ping -c 1 10.0.0.2
  wget http://example.com/status.cgi?ip=10.0.0.2
  ...
  ping -c 1 10.0.0.255
  wget http://example.com/status.cgi?ip=10.0.0.255

To run 100 processes simultaneously do:

  parallel -j 100 < jobs_to_run

As there is no I<command>, the jobs will be evaluated by the shell.
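
Because each line is evaluated by the shell, the lines in jobs_to_run
may themselves contain shell syntax such as pipes and redirections:

  echo 'ping -c 1 10.0.0.1 | tail -1 >> ping.log' | parallel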

=head2 EXAMPLE: Call program with FASTA sequence

FASTA files have the format:

  >Sequence name1
  sequence
  sequence continued
  >Sequence name2
  sequence
  sequence continued
  more sequence

To call B<myprog> with the sequence as argument run:

  cat file.fasta |
    parallel --pipe -N1 --recstart '>' --rrs \
      'read a; echo Name: "$a"; myprog $(tr -d "\n")'

=head2 EXAMPLE: Call program with interleaved FASTQ records

FASTQ files have the format:

  @M10991:61:000000000-A7EML:1:1101:14011:1001 1:N:0:28
  CTCCTAGGTCGGCATGATGGGGGAAGGAGAGCATGGGAAGAAATGAGAGAGTAGCAAGG
  +
  #8BCCGGGGGFEFECFGGGGGGGGG@;FFGGGEG@FF<EE<@FFC,CEGCCGGFF<FGF

Interleaved FASTQ starts with a line like these:

  @HWUSI-EAS100R:6:73:941:1973#0/1
  @EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG
  @EAS139:136:FC706VJ:2:2104:15343:197393 1:N:18:1

where '/1' and ' 1:' determine this is read 1.

This will cut big.fq into one chunk per CPU thread and pass it on
stdin (standard input) to the program fastq-reader:

  parallel --pipe-part -a big.fq --block -1 --regexp \
    --recend '\n' --recstart '@.*(/1| 1:.*)\n[A-Za-z\n\.~]' \
    fastq-reader

=head2 EXAMPLE: Processing a big file using more CPUs

To process a big file or some output you can use B<--pipe> to split
up the data into blocks and pipe the blocks into the processing
program.

If the program is B<gzip -9> you can do:

  cat bigfile | parallel --pipe --recend '' -k gzip -9 > bigfile.gz

This will split B<bigfile> into blocks of 1 MB and pass that to
B<gzip -9> in parallel. One B<gzip> will be run per CPU. The output
of B<gzip -9> will be kept in order and saved to B<bigfile.gz>.

B<gzip> works fine if the output is appended, but some processing
does not work like that - for example sorting. For this GNU
B<parallel> can put the output of each command into a file. This will
sort a big file in parallel:

  cat bigfile | parallel --pipe --files sort |\
    parallel -Xj1 sort -m {} ';' rm {} >bigfile.sort

Here B<bigfile> is split into blocks of around 1MB, each block ending
in '\n' (which is the default for B<--recend>). Each block is passed
to B<sort> and the output from B<sort> is saved into files. These
files are passed to the second B<parallel> that runs B<sort -m> on
the files before it removes the files. The output is saved to
B<bigfile.sort>.

GNU B<parallel>'s B<--pipe> maxes out at around 100 MB/s because
every byte has to be copied through GNU B<parallel>. But if
B<bigfile> is a real (seekable) file GNU B<parallel> can by-pass the
copying and send the parts directly to the program:

  parallel --pipe-part --block 100m -a bigfile --files sort |\
    parallel -Xj1 sort -m {} ';' rm {} >bigfile.sort

=head2 EXAMPLE: Grouping input lines

When processing with B<--pipe> you may have lines grouped by a
value. Here is I<my.csv>:

  Transaction Customer Item
      1       a        53
      2       b        65
      3       b        82
      4       c        96
      5       c        67
      6       c        13
      7       d        90
      8       d        43
      9       d        91
      10      d        84
      11      e        72
      12      e        102
      13      e        63
      14      e        56
      15      e        74

Let us assume you want GNU B<parallel> to process each customer. In
other words: You want all the transactions for a single customer to
be treated as a single record.

To do this we preprocess the data with a program that inserts a
record separator before each customer (column 2 = $F[1]). Here we
first make a 50 character random string, which we then use as the
separator:

  sep=`perl -e 'print map { ("a".."z","A".."Z")[rand(52)] } (1..50);'`
  cat my.csv | \
    perl -ape '$F[1] ne $l and print "'$sep'"; $l = $F[1]' | \
    parallel --recend $sep --rrs --pipe -N1 wc

If your program can process multiple customers replace B<-N1> with a
reasonable B<--blocksize>.

=head2 EXAMPLE: Running more than 250 jobs workaround

If you need to run a massive amount of jobs in parallel, then you
will likely hit the filehandle limit which is often around 250 jobs.
If you are super user you can raise the limit in
/etc/security/limits.conf but you can also use this workaround. The
filehandle limit is per process. That means that if you just spawn
more GNU B<parallel>s then each of them can run 250 jobs. This will
spawn up to 2500 jobs:

  cat myinput |\
    parallel --pipe -N 50 --round-robin -j50 parallel -j50 your_prg

This will spawn up to 62500 jobs (use with caution - you need 64 GB
RAM to do this, and you may need to increase
/proc/sys/kernel/pid_max):

  cat myinput |\
    parallel --pipe -N 250 --round-robin -j250 parallel -j250 your_prg

=head2 EXAMPLE: Working as mutex and counting semaphore

The command B<sem> is an alias for B<parallel --semaphore>.

A counting semaphore will allow a given number of jobs to be started
in the background. When that number of jobs are running in the
background, GNU B<sem> will wait for one of these to complete before
starting another command. B<sem --wait> will wait for all jobs to
complete.

Run 10 jobs concurrently in the background:

  for i in *.log ; do
    echo $i
    sem -j10 gzip $i ";" echo done
  done
  sem --wait

A mutex is a counting semaphore allowing only one job to run. This
will edit the file I<myfile> and prepend lines with the numbers 1 to
3 to the file:

  seq 3 | parallel sem sed -i -e '1i{}' myfile

As I<myfile> can be very big it is important that only one process
edits the file at a time.

Name the semaphore to have multiple different semaphores active at
the same time:

  seq 3 | parallel sem --id mymutex sed -i -e '1i{}' myfile

=head2 EXAMPLE: Mutex for a script

Assume a script is called from cron or from a web service, but only
one instance can be run at a time. With B<sem> and B<--shebang-wrap>
the script can be made to wait for other instances to finish. Here in
B<bash>:

  #!/usr/bin/sem --shebang-wrap -u --id $0 --fg /bin/bash

  echo This will run
  sleep 5
  echo exclusively

Here B<perl>:

  #!/usr/bin/sem --shebang-wrap -u --id $0 --fg /usr/bin/perl

  print "This will run ";
  sleep 5;
  print "exclusively\n";

Here B<python>:

  #!/usr/local/bin/sem --shebang-wrap -u --id $0 --fg /usr/bin/python

  import time
  print "This will run ";
  time.sleep(5)
  print "exclusively";

=head2 EXAMPLE: Start editor with file names from stdin (standard input)

You can use GNU B<parallel> to start interactive programs like emacs
or vi:

  cat filelist | parallel --tty -X emacs
  cat filelist | parallel --tty -X vi

If there are more files than will fit on a single command line, the
editor will be started again with the remaining files.

=head2 EXAMPLE: Running sudo

B<sudo> requires a password to run a command as root. It caches the
access, so you only need to enter the password again if you have not
used B<sudo> for a while.

The command:

  parallel sudo echo ::: This is a bad idea

is no good, as you would be prompted for the sudo password for each
of the jobs. Instead do:

  sudo parallel echo ::: This is a good idea

This way you only have to enter the sudo password once.

=head2 EXAMPLE: Run ping in parallel

B<ping> prints out statistics when killed with CTRL-C.

Unfortunately, CTRL-C will also normally kill GNU B<parallel>.

But by using B<--open-tty> and ignoring SIGINT you can get the wanted
effect:

  parallel -j0 --open-tty --lb --tag ping '{= $SIG{INT}=sub {} =}' \
    ::: 1.1.1.1 8.8.8.8 9.9.9.9 21.21.21.21 80.80.80.80 88.88.88.88

B<--open-tty> will make the B<ping>s receive SIGINT (from CTRL-C).
CTRL-C will not kill GNU B<parallel>, so that will only exit after
B<ping> is done.

=head2 EXAMPLE: GNU Parallel as queue system/batch manager

GNU B<parallel> can work as a simple job queue system or batch
manager. The idea is to put the jobs into a file and have GNU
B<parallel> read from that continuously. As GNU B<parallel> will stop
at end of file we use B<tail> to continue reading:

  true >jobqueue; tail -n+0 -f jobqueue | parallel

To submit your jobs to the queue:

  echo my_command my_arg >> jobqueue

You can of course use B<-S> to distribute the jobs to remote
computers:

  true >jobqueue; tail -n+0 -f jobqueue | parallel -S ..

Output will only be printed when the next input is read after a job
has finished: So you need to submit a job after the first has
finished to see the output from the first job.

If you keep this running for a long time, jobqueue will grow. A way
of removing the jobs already run is by making GNU B<parallel> stop
when it hits a special value and then restart. To use B<--eof> to
make GNU B<parallel> exit, B<tail> also needs to be forced to exit:

  true >jobqueue;
  while true; do
    tail -n+0 -f jobqueue |
      (parallel -E StOpHeRe -S ..; echo GNU Parallel is now done;
       perl -e 'while(<>){/StOpHeRe/ and last};print <>' jobqueue > j2;
       (seq 1000 >> jobqueue &);
       echo Done appending dummy data forcing tail to exit)
    echo tail exited;
    mv j2 jobqueue
  done

In some cases you can run on more CPUs and computers during the
night:

  # Day time
  echo 50% > jobfile
  cp day_server_list ~/.parallel/sshloginfile
  # Night time
  echo 100% > jobfile
  cp night_server_list ~/.parallel/sshloginfile
  tail -n+0 -f jobqueue | parallel --jobs jobfile -S ..

GNU B<parallel> discovers if B<jobfile> or B<~/.parallel/sshloginfile>
changes.

=head2 EXAMPLE: GNU Parallel as dir processor

If you have a dir in which users drop files that need to be processed
you can do this on GNU/Linux (if you know what B<inotifywait> is
called on other platforms file a bug report):

  inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
    parallel -u echo

This will run the command B<echo> on each file put into B<my_dir> or
subdirs of B<my_dir>.

You can of course use B<-S> to distribute the jobs to remote
computers:

  inotifywait -qmre MOVED_TO -e CLOSE_WRITE --format %w%f my_dir |\
    parallel -S .. -u echo

If the files to be processed are in a tar file then unpacking one
file and processing it immediately may be faster than first unpacking
all files. Set up the dir processor as above and unpack into the dir.

Using GNU B<parallel> as dir processor has the same limitations as
using GNU B<parallel> as queue system/batch manager.

=head2 EXAMPLE: Locate the missing package

If you have downloaded source and tried compiling it, you may have
seen:

  $ ./configure
  [...]
  checking for something.h... no
  configure: error: "libsomething not found"

Often it is not obvious which package you should install to get that
file. Debian has `apt-file` to search for a file. `tracefile` from
https://codeberg.org/tange/tangetools can tell which files a program
tried to access. In this case we are interested in one of the last
files:

  $ tracefile -un ./configure | tail | parallel -j0 apt-file search

=head1 AUTHOR

When using GNU B<parallel> for a publication please cite:

O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
The USENIX Magazine, February 2011:42-47.

This helps funding further development; and it won't cost you a cent.
If you pay 10000 EUR you should feel free to use GNU Parallel without
citing.

Copyright (C) 2007-10-18 Ole Tange, http://ole.tange.dk

Copyright (C) 2008-2010 Ole Tange, http://ole.tange.dk

Copyright (C) 2010-2024 Ole Tange, http://ole.tange.dk and Free
Software Foundation, Inc.

Parts of the manual concerning B<xargs> compatibility are inspired by
the manual of B<xargs> from GNU findutils 4.4.2.

=head1 LICENSE

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
at your option any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.

=head2 Documentation license I

Permission is granted to copy, distribute and/or modify this
documentation under the terms of the GNU Free Documentation License,
Version 1.3 or any later version published by the Free Software
Foundation; with no Invariant Sections, with no Front-Cover Texts,
and with no Back-Cover Texts.  A copy of the license is included in
the file LICENSES/GFDL-1.3-or-later.txt.

=head2 Documentation license II

You are free:

=over 9

=item B<to Share>

to copy, distribute and transmit the work

=item B<to Remix>

to adapt the work

=back

Under the following conditions:

=over 9

=item B<Attribution>

You must attribute the work in the manner specified by the author or
licensor (but not in any way that suggests that they endorse you or
your use of the work).

=item B<Share Alike>

If you alter, transform, or build upon this work, you may distribute
the resulting work only under the same, similar or a compatible
license.

=back

With the understanding that:

=over 9

=item B<Waiver>

Any of the above conditions can be waived if you get permission from
the copyright holder.

=item B<Public Domain>

Where the work or any of its elements is in the public domain under
applicable law, that status is in no way affected by the license.

=item B<Other Rights>

In no way are any of the following rights affected by the license:

=over 2

=item *

Your fair dealing or fair use rights, or other applicable
copyright exceptions and limitations;

=item *

The author's moral rights;

=item *

Rights other persons may have either in the work itself or in
how the work is used, such as publicity or privacy rights.

=back

=back

=over 9

=item B<Notice>

For any reuse or distribution, you must make clear to others the
license terms of this work.

=back

A copy of the full license is included in the file
LICENSES/CC-BY-SA-4.0.txt.

=head1 SEE ALSO

B<parallel>(1), B<parallel_tutorial>(7), B<env_parallel>(1),
B<parset>(1), B<parsort>(1), B<parallel_alternatives>(7),
B<parallel_design>(7), B<niceload>(1), B<sql>(1), B<ssh>(1),
B<ssh-agent>(1), B<sshpass>(1), B<ssh-copy-id>(1), B<rsync>(1)

=cut