This program splits data from stdin into blocks, which are sent as
input to parallel invocations of commands. The outputs from those
commands are then concatenated in the right order and sent to stdout.
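The split, process and concatenate idea can be sketched with standard
tools. This is only an illustration and not how splitjob itself is
implemented: it uses temporary files instead of streaming, and the
sample input, the 1024-byte block size and the use of "tr" as the
per-block command are all assumptions made for the demo.

```shell
tmp=$(mktemp -d)

# Hypothetical sample input: 300 short lines of text.
seq -f 'line %g of the input' 1 300 > "$tmp/input"

# Split the input into 1024-byte blocks. split names them block.aa,
# block.ab, ... so sorting the names gives back the input order.
split -b 1024 "$tmp/input" "$tmp/block."

# Run one command per block in the background, then wait for all of them.
for f in "$tmp"/block.*; do
    tr 'a-z' 'A-Z' < "$f" > "$f.out" &
done
wait

# Concatenate the outputs in block order.
cat "$tmp"/block.*.out > "$tmp/output"

# The result should match running the command on the whole input at once.
if tr 'a-z' 'A-Z' < "$tmp/input" | cmp -s - "$tmp/output"; then
    parallel_result=same
else
    parallel_result=different
fi
echo "parallel output vs. single run: $parallel_result"
rm -rf "$tmp"
```

The order of concatenation matters: the blocks may finish in any order,
but the outputs have to be joined in the order the blocks were read.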

Splitting up and parallelizing jobs like this can speed up
compression by using multiple CPU cores or even multiple computers.

For this approach to be useful, the compressed format needs to allow
multiple compressed files to be concatenated. This is the case for
gzip, bzip2, lzip and xz.
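The concatenation property is easy to check by hand. The sketch below
uses gzip; the file names and sample strings are made up for the demo.

```shell
tmp=$(mktemp -d)

# Compress two pieces of data independently.
printf 'hello ' | gzip > "$tmp/part1.gz"
printf 'world\n' | gzip > "$tmp/part2.gz"

# Concatenate the compressed files byte for byte.
cat "$tmp/part1.gz" "$tmp/part2.gz" > "$tmp/joined.gz"

# Decompressing the joined file yields the two inputs back to back.
joined_text=$(gzip -dc "$tmp/joined.gz")
echo "$joined_text"
rm -rf "$tmp"
```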

Example 1, use multiple logical cores:
    splitjob -j 4 bzip2 < bigfile > bigfile.bz2

Example 2, use remote machines:
    splitjob "ssh host1 gzip" "ssh host2 gzip" < f > f.gz

The above example assumes that ssh is configured to allow logins
without asking for a password. See the manpage for ssh-keygen or do
a Google search for examples of how to accomplish this.

Example 3, use bigger blocks to reduce overhead:
    splitjob -j 2 -b 10M gzip < file > file.gz

For "xz -9" a block size of 384 MB gives the best compression.

Example 4, parallel decompression:
    splitjob -X -r 10 -j 10 -b 384M "xz -d -" < file.xz > file