# SPDX-FileCopyrightText: 2021-2024 Ole Tange, http://ole.tange.dk and Free Software Foundation, Inc.
# SPDX-License-Identifier: GFDL-1.3-or-later
# SPDX-License-Identifier: CC-BY-SA-4.0

=head1 Why should you read this book?

If you write shell scripts to do the same processing for different
input, then GNU B<parallel> will make your life easier and make your
scripts run faster.

The book is written so you get the juicy parts first: The goal is that
you read just enough to get you going. GNU B<parallel> has an
overwhelming number of special features to help in different
situations, and to avoid overloading you with information, the most
used features are presented first.

All the examples are tested in Bash, and most will work in other
shells, too, but there are a few exceptions. So we recommend that you
use Bash while testing out the examples.

=head1 Learn GNU Parallel in 5 minutes

You just need to run commands in parallel. You do not care about
fine-tuning.

To get going, please run this to make some example files:

  # If your system does not have 'seq', replace 'seq' with 'jot'
  seq 5 | parallel seq {} '>' example.{}

GNU B<parallel> reads values from input sources. One input source is
the command line. The values are put after B<:::> :

  parallel echo ::: 1 2 3 4 5

This makes it easy to run the same program on some files:

  parallel wc ::: example.*

If you give multiple B<:::>s, GNU B<parallel> will generate all
combinations:

  parallel wc ::: -l -c ::: example.*

GNU B<parallel> can also read the values from stdin (standard input):
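
A minimal sketch of reading values from stdin, assuming the
B<example.*> files from the start of this chapter are in the current
directory. The word B<File> and the use of B<find> are only
illustrative choices:

```shell
# Make sure the example files exist (same command as earlier in this chapter)
seq 5 | parallel seq {} '>' example.{}

# Each line on stdin becomes one value; it is appended to the command.
# -k keeps the output in the same order as the input.
find example.* -print | parallel -k echo File
```

Each line printed should start with B<File> followed by one of the file names.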

=head2 Building the command line

The command line is put before the B<:::>. It can contain a
command and options for the command:

  parallel wc -l ::: example.*

The command can contain multiple programs. Just remember to quote
characters that are interpreted by the shell (such as B<;>):

  parallel echo counting lines';' wc -l ::: example.*

The value will normally be appended to the command, but can be placed
anywhere by using the replacement string B<{}>:

  parallel echo counting {}';' wc -l {} ::: example.*

When using multiple input sources you use the positional replacement
strings B<{1}> and B<{2}>:

  parallel echo count {1} in {2}';' wc {1} {2} ::: -l -c ::: example.*

You can check what will be run with B<--dry-run>:

  parallel --dry-run echo count {1} in {2}';' wc {1} {2} ::: -l -c ::: example.*

This is a good idea to do for every command until you are comfortable
with GNU B<parallel>.

=head2 Controlling the output

The output will be printed as soon as the command completes. This
means the output may come in a different order than the input:

  parallel sleep {}';' echo {} done ::: 5 4 3 2 1

You can force GNU B<parallel> to print in the order of the values with
B<--keep-order>/B<-k>. This will still run the commands in parallel.
The output of the later jobs will be delayed until the earlier jobs
are printed:

  parallel -k sleep {}';' echo {} done ::: 5 4 3 2 1

=head2 Controlling the execution

If your jobs are compute intensive, you will most likely run one job
for each core in the system. This is the default for GNU B<parallel>.

But sometimes you want more jobs running. You control the number of
job slots with B<-j>. Give B<-j> the number of jobs to run in
parallel, for example B<-j5> to run 5 jobs at a time:

  parallel -j5 \
    wget https://ftpmirror.gnu.org/parallel/parallel-{1}{2}22.tar.bz2 \
    ::: 2012 2013 2014 2015 2016 \
    ::: 01 02 03 04 05 06 07 08 09 10 11 12

GNU B<parallel> can also pass blocks of data to commands on stdin
(standard input):

  seq 1000000 | parallel --pipe wc

This can be used to process big text files. By default GNU B<parallel>
splits on \n (newline) and passes a block of around 1 MB to each job.
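
As a sketch of how the block size affects the split, the following
counts the lines in each block. The block size of 2M passed to
B<--block> is an arbitrary value picked for illustration:

```shell
# Default block size (around 1 MB): each job prints the line count of its block
seq 1000000 | parallel --pipe wc -l

# Larger blocks give fewer jobs; the per-block counts still add up to the total
seq 1000000 | parallel --pipe --block 2M wc -l |
  awk '{ sum += $1 } END { print sum }'
```

Because blocks are split at newlines, every line ends up in exactly
one block, so the sum equals the number of input lines.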

You have now learned the basic use of GNU B<parallel>. This will
probably cover most cases of your use of GNU B<parallel>.

The rest of this document will go into more details on each of the
sections and cover special use cases.

=head1 Learn GNU Parallel in an hour

In this part we will dive deeper into what you learned in the first 5 minutes.

To get going, please run this to make some example files:
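
The examples in this part refer to two files, B<seq6> and B<seq-6>.
The commands that create them are not shown here, so the following is
only a sketch. It assumes that B<seq6> holds the numbers 1 to 6 and
B<seq-6> holds 6 down to 1, one per line, which matches the equivalent
commands listed for the input sources:

```shell
# Assumption: seq6 holds 1..6 and seq-6 holds 6..1, one value per line
seq 6 > seq6
seq 6 -1 1 > seq-6
```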

=head2 Input sources

On top of the command line, input sources can also be stdin (standard
input or '-'), files, and fifos, and they can be mixed. Files are given
after B<-a> or B<::::>. So these all do the same:

  parallel echo Dice1={1} Dice2={2} ::: 1 2 3 4 5 6 ::: 6 5 4 3 2 1
  parallel echo Dice1={1} Dice2={2} :::: <(seq 6) :::: <(seq 6 -1 1)
  parallel echo Dice1={1} Dice2={2} :::: seq6 seq-6
  parallel echo Dice1={1} Dice2={2} :::: seq6 :::: seq-6
  parallel -a seq6 -a seq-6 echo Dice1={1} Dice2={2}
  parallel -a seq6 echo Dice1={1} Dice2={2} :::: seq-6
  parallel echo Dice1={1} Dice2={2} ::: 1 2 3 4 5 6 :::: seq-6
  cat seq-6 | parallel echo Dice1={1} Dice2={2} :::: seq6 -

If stdin (standard input) is the only input source, you do not need the '-':

  cat seq6 | parallel echo Dice1={1}

=head3 Linking input sources

You can link multiple input sources with B<:::+> and B<::::+>:

  parallel echo {1}={2} ::: I II III IV V VI :::+ 1 2 3 4 5 6
  parallel echo {1}={2} ::: I II III IV V VI ::::+ seq6

The B<:::+> (and B<::::+>) will link each value to the corresponding
value in the previous input source, so value number 3 from the first
input source will be linked to value number 3 from the second input
source.

You can combine B<:::+> and B<:::>, so you link 2 input sources, but
generate all combinations with other input sources:

  parallel echo Dice1={1}={2} Dice2={3}={4} ::: I II III IV V VI ::::+ seq6 \
    ::: VI V IV III II I ::::+ seq-6

=head2 Building the command line

The command can be a script, a binary or a Bash function if the
function is exported using B<export -f>:

  parallel my_func ::: 1 2 3
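
The definition of B<my_func> is not shown above. A minimal sketch,
assuming a function that simply echoes its argument (the body is
hypothetical and only for illustration), could look like this in Bash:

```shell
# Hypothetical function body, for illustration only; requires Bash
my_func() {
  echo in my_func "$1"
}

# Export the function so the shells started by GNU parallel can see it
export -f my_func

# -k keeps the output in the same order as the values
parallel -k my_func ::: 1 2 3
```

Without the B<export -f>, the jobs would fail because B<my_func> would
not be defined in the shells that run them.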

If the command is complex, it often improves readability to make it
into a function.

=head3 The replacement strings

GNU B<parallel> has some replacement strings to make it easier to
refer to the input read from the input sources.

If the input is B<mydir/mysubdir/myfile.myext> then:

  {} = mydir/mysubdir/myfile.myext
  {.} = mydir/mysubdir/myfile
  {/} = myfile.myext
  {//} = mydir/mysubdir
  {/.} = myfile
  {#} = the sequence number of the job
  {%} = the job slot number

When a job is started it gets a sequence number that starts at 1 and
increases by 1 for each new job. The job also gets assigned a slot
number. This number is from 1 to the number of jobs running in
parallel. It is unique between the running jobs, but is re-used as
soon as a job finishes.
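
To see the sequence and slot numbers in action, here is a small
sketch. The limit of 2 jobs and the short sleep are arbitrary choices
made for illustration:

```shell
# Run at most 2 jobs at a time; -k keeps the output in input order.
# {#} counts up from 1 for each job, while {%} is always 1 or 2
# because a slot number is re-used as soon as its job finishes.
parallel -k -j2 'sleep 0.1; echo value={} seq={#} slot={%}' ::: a b c d e
```

Which slot a given job gets can vary from run to run, but the sequence
number always follows the order of the values.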

=head4 The positional replacement strings

The replacement strings have corresponding positional replacement
strings. If the value from the 3rd input source is
B<mydir/mysubdir/myfile.myext>:

  {3} = mydir/mysubdir/myfile.myext
  {3.} = mydir/mysubdir/myfile
  {3/} = myfile.myext
  {3//} = mydir/mysubdir
  {3/.} = myfile

So the number of the input source is simply prepended inside the {}'s.
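
A short sketch that combines positional replacement strings from two
input sources. The file names are made up for the example and do not
need to exist, because B<echo> only prints them:

```shell
# {1/.} = basename of the 1st value without extension
# {2//} = directory of the 2nd value, {2/} = basename of the 2nd value
parallel -k echo '{1/.}' '{2//}' '{2/}' ::: src/main.c ::: build/out/main.o
```

This prints B<main build/out main.o>.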

=head1 Replacement strings

--plus replacement strings

Change the replacement string (-I --extensionreplace --basenamereplace --dirnamereplace --basenameextensionreplace --seqreplace --slotreplace)

--header with named replacement string

Dynamic replacement strings

=head2 Defining replacement strings

=head2 Copying environment

=head2 Controlling the output

B<parset> is a shell function to get the output from GNU B<parallel>
into shell variables.

B<parset> is fully supported for B<Bash/Zsh/Ksh> and partially supported
for B<ash/dash>. I will assume you run B<Bash>.

To activate B<parset> you have to run:

  . `which env_parallel.bash`

(replace B<bash> with your shell's name).

You can then put the output of each job into its own variable. The
variable names can be separated by commas:

  parset a,b,c seq ::: 4 5 6

or by spaces:

  parset 'a b c' seq ::: 4 5 6

If you give a single variable, this will become an array:

  parset arr seq ::: 4 5 6

B<parset> has one limitation: If it reads from a pipe, the output will
be lost, because the variables are set in a subshell:

  echo This will not work | parset myarr echo
  echo Nothing: "${myarr[*]}"

Instead you can do this:

  echo This will work > tempfile
  parset myarr echo < tempfile
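
In Bash, process substitution is another way to avoid the pipe without
a temporary file. The following sketch assumes that
B<env_parallel.bash> is installed somewhere in your B<$PATH>:

```shell
# Activate parset (assumes env_parallel.bash can be found in $PATH)
. "$(which env_parallel.bash)"

# The input comes from a process substitution, not a pipe,
# so parset runs in the current shell and the array survives
parset myarr echo < <(echo This will work)
echo "${myarr[*]}"
```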

=head2 Controlling the execution

=head2 Remote execution

For this section you must have B<ssh> access with no password to 2
servers: B<$server1> and B<$server2>.

  server1=server.example.com
  server2=server2.example.net

So you must be able to do this:

  ssh $server1 echo works
  ssh $server2 echo works

It can be set up by running 'ssh-keygen -t ed25519; ssh-copy-id $server1'
and using an empty passphrase. Or you can use B<ssh-agent>.

=head3 --transferfile

B<--transferfile> I<filename> will transfer I<filename> to the
worker. I<filename> can contain a replacement string:

  parallel -S $server1,$server2 --transferfile {} wc ::: example.*
  parallel -S $server1,$server2 --transferfile {2} \
    echo count {1} in {2}';' wc {1} {2} ::: -l -c ::: example.*

A shorthand for B<--transferfile {}> is B<--transfer>.

A shorthand for B<--transfer --return {} --cleanup> is B<--trc {}>.

=head1 Advanced usage

parset fifo, cmd substitution, arrayelements, array with var names and cmds, env_parset

Interfacing with JSON/jq

  # $1 is the URL of the thread; the board and thread ID are cut out of it
  board="$(printf -- '%s' "${1}" | cut -d '/' -f4)"
  thread="$(printf -- '%s' "${1}" | cut -d '/' -f6)"
  wget -qO- "https://a.4cdn.org/${board}/thread/${thread}.json" |
    jq -r '
      .posts
      | map(select(.tim != null))
      | map((.tim | tostring) + .ext)
      | map("https://i.4cdn.org/'"${board}"'/"+.)[]
    ' |
    parallel --gnu -j 0 wget -nv

Interfacing with XML/?

Interfacing with HTML/?

=head2 Controlling the execution

=head2 Remote execution

  seq 10 | parallel --sshlogin 'ssh -i "key.pem" a@b.com' echo

  seq 10 | PARALLEL_SSH='ssh -i "key.pem"' parallel --sshlogin a@b.com echo

  seq 10 | parallel --ssh 'ssh -i "key.pem"' --sshlogin a@b.com echo

The sshlogin file format

Check if servers are up