With GNU parallel we can quickly parallelize any shell command. The documentation is extensive, so the following are my getting-started notes.
In general, the syntax is:
parallel [parallel-options] [command] [command-arguments] [::: input]
The ::: delimiter marks the beginning of the inputs to the command.
By default, the input is a space-separated list, and the command runs once for each element, with that element passed to it.
For example:
parallel --max-procs 2 echo ::: 1 2 3 4
runs echo four times, using two job slots, so at most two runs execute at the same time.
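To make the job slots visible, here is a small sketch of my own, assuming GNU parallel is installed. The sleep and the timing are only for illustration, and --keep-order is an extra flag I add so that the output appears in input order. Four one-second jobs on two slots should take roughly two seconds rather than four:

```shell
# Four jobs that each sleep for one second, run on two job slots.
# Each input is appended to the command, so the lines end in 1, 2, 3, 4.
start=$(date +%s)
parallel --keep-order --max-procs 2 'sleep 1; echo finished' ::: 1 2 3 4
end=$(date +%s)

# Elapsed wall-clock time in whole seconds.
echo "elapsed: $((end - start))s"
```

Changing --max-procs to 1 or 4 and rerunning shows how the elapsed time follows the number of slots.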
Instead of using ::: and writing out all the input, we can generate a sequence with seq and pipe it to parallel:
seq 1 10 | parallel --max-procs 2 echo "Running."
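Both ways of supplying input behave the same. A quick check, as a sketch assuming GNU parallel is installed (--keep-order is an extra flag I add here so that the output appears in input order): because the command contains no placeholder, parallel appends each input to it, so every output line ends in its number.

```shell
# Inputs given on the command line after :::
parallel --keep-order --max-procs 2 echo "Running." ::: 1 2 3

# The same inputs read from standard input, one per line.
seq 1 3 | parallel --keep-order --max-procs 2 echo "Running."

# Each form prints "Running. 1", "Running. 2", "Running. 3".
```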
Some more tidbits:
- We can use {} to refer to the input:
  parallel --max-procs 2 echo "Run {}" ::: 1 2 3 4
- Use --verbose so that you can see exactly how parallel calls the command. This may be important, because parallel may have to quote the arguments; if that’s necessary, use --quote.
- You may want to simply run the command several times without passing any of the input to it. In that case, use --max-args 0.
- You can make parallel prefix each output line with its input arguments using --tag.
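These tidbits can be tried out in one go. This is a sketch assuming GNU parallel is installed; the echo commands and inputs are just placeholders, and --keep-order is an extra flag I add so that the output appears in input order.

```shell
# {} is replaced by the current input: prints "Run 1", "Run 2", "Run 3".
parallel --keep-order echo "Run {}" ::: 1 2 3

# --max-args 0: one run per input, but the input is not passed to the command.
# Prints "Running." three times.
seq 1 3 | parallel --max-args 0 echo "Running."

# --tag: each output line is prefixed with the input and a tab.
parallel --keep-order --tag echo "Run {}" ::: a b

# --verbose prints each command before running it. Comparing the two runs
# shows what --quote changes: without it, the shell started by parallel
# splits the words again and the double space in the argument is lost.
parallel --verbose echo "two  spaces" ::: x
parallel --verbose --quote echo "two  spaces" ::: x
```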
I used parallel to make the same curl request several times.
The following curl parameters were useful:
- --output /dev/null: don’t show the response body
- --silent --show-error: don’t show a progress bar, but still show errors
- --write-out "Total time: %{time_total}s\n": write out some information about the request. See the man page for all available variables.
So, in the end:
# Make the same request 4 times, at most 2 at once; the seq values only count the runs.
seq 1 4 | parallel \
  --max-args 0 \
  --max-procs 2 \
  --quote \
  curl \
  --output /dev/null \
  --silent --show-error \
  --write-out "Total time: %{time_total}s - response code: %{response_code}\n" \
  www.google.de