2011-12-09 22:25:20 +00:00
Luk filen ved EOF - lad være med bare at læse videre.
2012-09-07 16:51:08 +00:00
> /tmp/ged; tail -f /tmp/ged| xargs -n1 -E eof & sleep 1; echo echo a >>/tmp/ged; echo eof >>/tmp/ged; seq 4 >>/tmp/ged; wait
2011-12-09 22:25:20 +00:00
2012-09-07 16:51:08 +00:00
> /tmp/ged; tail -f /tmp/ged| parallel -n1 -E eof & sleep 1; echo echo a >>/tmp/ged; echo eof >>/tmp/ged; seq 4 >>/tmp/ged; wait
2011-12-09 22:25:20 +00:00
2011-06-11 23:19:29 +00:00
niceload seeks last column:
iostat -x 1 2
niceload --start-condition
2011-11-22 21:38:57 +00:00
niceload should prioritize jobs and only unsuspend the highest
priority job. If the running job with lowest pri has run for 1 sec:
Consider unsuspending next pri job.
2011-06-11 23:19:29 +00:00
2011-06-05 16:27:50 +00:00
Til QUOTING:
2011-05-20 23:07:45 +00:00
2011-09-09 19:15:00 +00:00
cat <<'_EOF' | parallel -v echo
awk -v FS="\",\"" '{print $1, $3, $4, $5, $9, $14}' | grep -v "#" | sed -e '1d' -e 's/\"//g' -e 's/\/\/\//\t/g' | cut -f1-6,11 | sed -e 's/\/\//\t/g' -e 's/ /\t/g
_EOF
2011-06-05 16:27:50 +00:00
FN="two spaces"
echo 1 | parallel -q echo {} "$FN"
# Prints 2 spaces between 'two' and 'spaces'
2011-05-20 23:07:45 +00:00
2011-06-05 16:27:50 +00:00
-q will not work with composed commands as it will quote the ; as
well. So composed commands have to be quoted by hand:
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
# Using export:
FN2="two spaces"
export FN2
echo 1 | parallel echo {} \"\$FN2\" \; echo \"\$FN2\" {}
# Prints 2 spaces between 'two' and 'spaces'
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
# Without export:
FN3="two spaces"
echo 1 | parallel echo {} \""$FN3"\" \; echo \'"$FN3"\' {}
2011-05-20 23:07:45 +00:00
2011-06-05 16:27:50 +00:00
# By quoting the space in the variable
FN4='two\ \ spaces'
echo 1 | parallel echo {} $FN4 \; echo $FN4 {}
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
= Bug? ==
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
locate .gz | parallel -X find {} -size +1000 -size -2000 | parallel --workdir ... -S .. --trc {/}.bz2 'zcat {} | bzip2 > {/}.bz2'
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
== Compare ==
2011-05-05 18:50:53 +00:00
2013-07-03 22:50:08 +00:00
Unchanged since 2008 http://code.google.com/p/spawntool/
Unchanged since 2011 http://code.google.com/p/push/
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
== Bug? ==
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
.parallel/config with --long-options
2011-05-05 18:50:53 +00:00
2011-05-20 23:07:45 +00:00
2011-06-05 16:27:50 +00:00
== SQL ==
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
Example with %0a as newline
sql :my_postgres?'\dt %0a SELECT * FROM users'
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
cat ~/.sql/aliases | parallel --colsep '\s' sql {1} '"select 0.14+3;" | grep -q 3.14 || (echo dead: {1}; exit 1)'
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
== FEX ==
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
fex syntax for splitting fields
http://www.semicomplete.com/projects/fex/
sql :foo 'select * from bar' | parallel --fex '|{1,2}' do_stuff {2} {1}
2011-05-20 23:07:45 +00:00
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
--autocolsep: Læs alle linjer.
Prøv fastlængde: Find tegn, som står i alle linjer på de samme pladser. Risiko for falske pos
Prøv fieldsep: Find eet tegn, som optræder det samme antal gange i alle linjer (tab sep)
Prøv klyngesep: Find den samme klynge tegn, som står samme antal gange i alle linjer (' | ' sep)
Fjern whitespace før og efter colonne
2011-05-05 18:50:53 +00:00
2011-06-05 16:27:50 +00:00
hvis der er n af tegn A og 2n af tegn B, så
2011-05-05 21:36:12 +00:00
2011-06-05 16:27:50 +00:00
a | b | c
2011-05-05 21:36:12 +00:00
2011-06-05 16:27:50 +00:00
Simpleste: tab sep
2011-05-05 21:36:12 +00:00
2011-06-05 16:27:50 +00:00
for hver linje
max,min count for hver char
2011-05-05 21:36:12 +00:00
2011-06-05 16:27:50 +00:00
for hver char
if max == min :
potentiel
min_potentiel = min(min_potentiel,min)
2011-05-05 21:36:12 +00:00
2011-06-05 16:27:50 +00:00
for potentiel:
if min % min_potentiel = 0: sepchars += potentiel,no of sepchars += min / min_potentiel
2011-05-20 23:07:45 +00:00
2011-06-05 16:27:50 +00:00
colsep = [sepchars]{no_of_sepchars}
2011-05-20 23:07:45 +00:00
2011-06-05 16:27:50 +00:00
# TODO compute how many can be transferred within max_line_length
2011-05-05 18:50:53 +00:00
2011-06-11 23:19:29 +00:00
Til inspiration.
2012-02-20 00:48:28 +00:00
Hvis du stadig er ved at lave post- eller visitkort ting, så kunne du evt tilføje en QR code under frimærket. Med MECARD tagget kan flere tags gemmes i en og samme fil:
2011-06-11 23:19:29 +00:00
qrencode -l L -o x.png "MECARD:N:GNU Parallel;EMAIL:parallel@gnu.org;URL:gnu.org/software/parallel;"
Den ser OK ud i en Androide tlf.
Husk at skrive indholdet under billedet, det er irreterende at skulle gætte.
2011-05-03 22:30:56 +00:00
GNU parallel is a UNIX-tool for running commands in parallel.
To gzip all files running one job per CPU write:
parallel gzip ::: *
2012-02-20 00:48:28 +00:00
Watch the intro video to learn more: http://pi.dk/1
Or read more about GNU parallel: http://gnu.org/s/parallel
2011-05-03 22:30:56 +00:00
2011-04-08 19:57:19 +00:00
job->start():
$jobslot = Global::jobslot->$sshlogin
sub get_jobslot {
my $sshlogin = shift;
my $jobslot_id = pop @Global::jobslots{$sshlogin};
if not defined $jobslot_id {
$jobslot_id = ++$Global::max_jobslot_id;
}
return $jobslot_id;
}
sub release_jobslot {
my $sshlogin = shift;
my $jobslot_id = shift;
push @Global::jobslots{$sshlogin}, $jobslot_id;
}
Test sshlogins in parallel. Assume parallel is in path
seq 1 10 | parallel -I {o} 'seq 1 255 | parallel echo ssh -oNoHostAuthenticationForLocalhost=true 127.0.{o}.{}' >/tmp/sshloginfile
seq 1 1000 | parallel --sshloginfile /tmp/sshloginfile echo
ssh -F /tmp/
2011-01-25 23:34:08 +00:00
Example:
Chop mbox into emails
Parallel sort
2011-01-02 00:01:21 +00:00
codecoverage
2010-12-22 09:21:58 +00:00
2011-01-02 00:01:21 +00:00
Testsuite: sem without ~/.parallel
2010-12-21 17:08:16 +00:00
2011-06-05 16:27:50 +00:00
Dont start:
2010-12-19 00:38:36 +00:00
2011-06-05 16:27:50 +00:00
* seek
2010-12-19 00:38:36 +00:00
2011-11-22 21:38:57 +00:00
= Videos =
2010-12-19 00:38:36 +00:00
2011-11-22 21:38:57 +00:00
== Ideas ==
# GNU Parallel 20111122 - The Silvio release
2011-10-17 01:10:32 +00:00
--tag
-Jt (e.g. --tag --nonall)
-Jr (--sshloginfile myremoteservers)
-Jl (--sshloginfile mylocalservers)
-Jt -Jr -Jl
./src/parallel --sshloginfile ~/.parallel/remote --slf ~/.parallel/local --nonall hostname
2011-11-22 21:38:57 +00:00
sem
single file installation
2011-12-09 22:25:20 +00:00
./configure --prefix=$HOME && make && make install
Or if your system lacks 'make' you can simply copy src/parallel
src/sem src/niceload src/sql to a dir in your path.
wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
2012-02-20 00:48:28 +00:00
--record measured in lines (useful for fastq)
parallel --tag traceroute ::: pi.dk debian.org
# Thank you for watching
#
# If you like GNU Parallel:
# * Post this video on Reddit/Diaspora*/forums/blogs/
# Identi.ca/Google+/Twitter/Facebook/Linkedin/mailing lists
# * Join the mailing list https://lists.gnu.org/mailman/listinfo/parallel
# * Get the merchandise https://www.gnu.org/s/parallel/merchandise.html
# * Give a demo at your local user group
# * Request or write a review for your favourite blog or magazine
# * Request or build a package for your favourite distribution
# * Invite me for your next conference (Contact http://ole.tange.dk)
#
# If GNU Parallel saves you money:
# * (Have your company) donate to FSF https://my.fsf.org/donate/
#
# If you use GNU Parallel for a publication please cite:
# O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
# The USENIX Magazine, February 2011:42-47.
#
# Find GNU Parallel at https://www.gnu.org/s/parallel/
2011-12-09 22:25:20 +00:00
2010-12-19 00:38:36 +00:00
2011-11-22 21:38:57 +00:00
== Video: GNU Parallel 20110522 - The Pakistan Release ==
2010-12-19 00:38:36 +00:00
2011-06-05 16:27:50 +00:00
# GNU Parallel 20110522 - The Pakistan Release
2010-12-19 00:38:36 +00:00
2011-06-05 16:27:50 +00:00
I am Ole Tange. I am the author of GNU Parallel.
2010-12-19 00:38:36 +00:00
2011-06-05 16:27:50 +00:00
So far GNU Parallel has been focused on replacing a single
for-loop. The Pakistan release introduces a way to replace nested
loops.
2010-11-29 22:59:16 +00:00
2011-06-05 16:27:50 +00:00
# NESTED FOR LOOPS
2010-11-22 09:35:53 +00:00
2011-06-05 16:27:50 +00:00
As example I will use the image manipulation program 'convert' from
ImageMagick. This command will convert foo.png to jpg with a size of
800 and JPEG-quality of 95.
2010-11-22 09:35:53 +00:00
2011-06-05 16:27:50 +00:00
convert -resize 800 -quality 95 foo.png foo_800_q95.jpg
2010-10-26 23:50:58 +00:00
2011-06-05 16:27:50 +00:00
With a for-loop it can be done on a list of files:
2010-10-26 23:50:58 +00:00
2011-06-05 16:27:50 +00:00
for file in *.png ; do
convert -resize 800 -quality 95 $file ${file%.*}_800_q95.jpg
done
2010-10-26 23:50:58 +00:00
2011-06-05 16:27:50 +00:00
This is the kind of loops GNU Parallel is good at replacing:
2010-10-26 23:50:58 +00:00
2011-06-05 16:27:50 +00:00
parallel convert -resize 800 -quality 95 {} {.}_800_q95.jpg ::: *.png
2010-10-26 23:50:58 +00:00
2010-09-14 16:37:26 +00:00
2011-06-05 16:27:50 +00:00
To get the images in 3 different JPEG-qualities you can use a nested for-loop:
2010-09-14 16:37:26 +00:00
2011-06-05 16:27:50 +00:00
for qual in 25 50 95 ; do
for file in *.png ; do
convert -resize 800 -quality $qual $file ${file%.*}_800_q${qual}.jpg
done
done
2010-09-14 16:37:26 +00:00
2011-06-05 16:27:50 +00:00
With GNU Parallel 'Pakistan' you can do this:
2010-09-01 13:26:45 +00:00
2011-06-05 16:27:50 +00:00
parallel convert -resize 800 -quality {2} {1} {1.}_800_q{2}.jpg ::: *.png ::: 25 50 95
2010-08-01 18:09:31 +00:00
2011-06-05 16:27:50 +00:00
The new is that you can use the ::: multiple times. GNU Parallel will
then generate all the combinations and execute the command with these.
The {1} and {2} will be replaced by the relevant input source.
2010-07-18 02:17:49 +00:00
2011-06-05 16:27:50 +00:00
To get the 3 different JPEG-qualities in 2 different sizes you can
nest the for-loop even further:
2010-07-18 02:17:49 +00:00
2011-06-05 16:27:50 +00:00
for size in 800 30 ; do
for qual in 25 50 95 ; do
for file in *.png ; do
convert -resize $size -quality $qual $file ${file%.*}_${size}_q${qual}.jpg
done
done
done
2010-07-18 16:04:07 +00:00
2011-06-05 16:27:50 +00:00
With GNU Parallel 'Pakistan' you can do this:
2010-07-18 16:04:07 +00:00
2011-06-05 16:27:50 +00:00
parallel convert -resize {3} -quality {2} {1} {1.}_{3}_q{2}.jpg ::: *.png ::: 25 50 95 ::: 800 30
2010-07-18 16:04:07 +00:00
2011-06-05 16:27:50 +00:00
GNU Parallel will again generate all the combinations of the input
sources and run the jobs in parallel.
2010-07-18 16:04:07 +00:00
2011-06-05 16:27:50 +00:00
You can also provide the arguments in a file. This will do the same as above:
2010-07-18 16:04:07 +00:00
2011-06-05 16:27:50 +00:00
(echo 25; echo 50; echo 95) > qualities
ls *.png > png-files
(echo 800; echo 30) > sizes
parallel convert -resize {3} -quality {2} {1} {1.}_{3}_q{2}.jpg :::: png-files qualities sizes
2010-07-18 16:04:07 +00:00
2011-06-05 16:27:50 +00:00
But you can even mix triple and quad colon. These will do the same:
2010-07-18 16:04:07 +00:00
2011-06-05 16:27:50 +00:00
parallel convert -resize {3} -quality {2} {1} {1.}_{3}_q{2}.jpg :::: png-files ::: 25 50 95 ::: 800 30
parallel convert -resize {3} -quality {2} {1} {1.}_{3}_q{2}.jpg :::: png-files ::: 25 50 95 :::: sizes
The special file '-' reads from standard input. This will do the same as above:
ls *.png | parallel convert -resize {3} -quality {2} {1} {1.}_{3}_q{2}.jpg :::: - ::: 25 50 95 :::: sizes
This is probably one of the ways you will use this feature as that can easily be combined with find:
find . -name '*.png' | \
parallel convert -resize {3} -quality {2} {1} {1.}_{3}_q{2}.jpg :::: - ::: 25 50 95 :::: sizes
# Thank you for watching
#
# If you like GNU Parallel:
2011-10-10 20:20:18 +00:00
# * Post this video on Reddit/forums/blogs/Identi.ca/Google+/Twitter/Facebook/Linkedin
2011-06-05 16:27:50 +00:00
# * Join the mailing list http://lists.gnu.org/mailman/listinfo/parallel
# * Request or write a review for your favourite blog or magazine
# * Request or build a package for your favourite distribution
# * Invite me for your next conference (Contact http://ole.tange.dk)
#
# If GNU Parallel saves you money:
# * (Have your company) donate to FSF https://my.fsf.org/donate/
#
# If you use GNU Parallel for a publication please cite:
# O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
# The USENIX Magazine, February 2011:42-47.
#
# Find GNU Parallel at http://www.gnu.org/software/parallel/
2010-07-18 16:04:07 +00:00
2010-06-14 22:05:47 +00:00
2011-02-02 15:36:29 +00:00
=head1 YouTube video --pipe
cp parallel.fasta parallel.mbox lucene.tar
# GNU Parallel 20110205 - The FOSDEM Release
I assume you already know GNU Parallel. If not watch the intro video first.
GNU Parallel has so far worked similar to xargs. But the FOSDEM
release of GNU Parallel introduces the new --pipe option. It makes GNU
Parallel work similar to tee.
tee pipes a copy of the output to a file and a copy to another
program.
seq 1 5 | tee myfile | wc
Here it pipes a copy to the file myfile and to the command word count (wc).
cat myfile
and we can see the content is what we expected.
The pipe option of GNU Parallel splits data into records and pipes a
block of records into a program:
seq 1 5 | parallel --pipe -N1 cat';' echo foo
Here we pipe each number to the command cat and print foo after
running cat.
GNU Parallel does this in parallel starting one process per cpu so the
order may be different because one command may finish before another.
# RECORD SEPARATORS
GNU Parallel splits on record separators.
seq 1 5 | parallel --pipe --recend '\n' -N1 cat';' echo foo
This is the example we saw before: the record separator is \n and
--recend will keep the record separator at the end of the record.
But if what your records start with a record separator? Here is a
fast-a file:
cat parallel.fasta
Every record start with a >. To keep that with the record you use
--recstart:
cat parallel.fasta | parallel --pipe --recstart '>' -N1 cat';' echo foo
But what if you have both? mbox files is an example that has both an
ending and starting separator:
cat parallel.mbox |
parallel --pipe --recend '\n\n' --recstart 'From ' -N1 cat';' echo foo | less #
The two newlines are staying with the email before and the From_ stays with the next record.
GNU Parallel cannot guarantee the first record will start with record
separator and it cannot guarantee the last record will end with record
separator. You will simply get what is first and last.
But GNU Parallel _does_ guarantee that it will only split at record
separators.
# NUMBER OF RECORDS
So far we have used -N1. This tells GNU Parallel to pipe one record to
the program.
seq 1 5 | parallel --pipe -N1 cat';' echo foo
But we can choose any amount:
seq 1 5 | parallel --pipe -N3 cat';' echo foo
This will pipe blocks of 3 records into cat and if there is not enough the last will only get two.
# BLOCKSIZE
However, using -N is inefficient. It is faster to pipe a full block into the program.
cat /usr/share/dict/words | parallel --pipe --blocksize 500k wc
We here tell GNU Parallel to split on \n and pipe blocks of 500 KB to
wc. 1 MB is the default:
cat /usr/share/dict/words | parallel --pipe wc
If you just have a bunch of bytes you often do not care about the
record separator. To split input into chunks you can disable the
--recend
ls -l lucene.tar
cat lucene.tar | parallel --pipe --recend '' -k gzip > lucene.tar.gz
GNU Parallel will then split the input into 1 MB blocks; pipe that to
gzip and -k will make sure the order of the output is kept before
saving to the tar.gz file.
The beauty of gzip is that if you concatenate two gzip files it is a
valid gzip file. To test this:
tar tvzf lucene.tar.gz #
# OUTPUT AS FILE
Sometimes the output of GNU Parallel cannot be mixed in a single stream like this:
seq 1 10 | shuf | parallel --pipe -N 3 sort -n
As you can see each block of 3 is sorted but the whole output is not sorted.
GNU Parallel can give the output in file. GNU Parallel will the list the
files created:
seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n
Each of these files contains a sorted block:
cat
Sort has -m to merge sorted files into a sorted stream
seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n | parallel -mj1 sort -nm
-m will append all the files behind the sort command and the -j1 will
make sure we only run one command. The only part missing now is
cleaning up by removing the temporary files. We can do that by
appending rm
seq 1 10 | shuf | parallel --pipe --files -N 3 sort -n |
parallel -mj1 sort -nm {} ";"rm {}
# Thank you for watching
#
# If you like GNU Parallel:
# * Post this video on forums/blogs/Twitter/Facebook/Linkedin
# * Join the mailing list http://lists.gnu.org/mailman/listinfo/parallel
# * Request or write a review for your favourite magazine
# * Request or build a package for your favourite distribution
# * Invite me for your next conference (Contact http://ole.tange.dk)
#
# If GNU Parallel saves you money:
# * (Have your company) donate to FSF https://my.fsf.org/donate/
#
2011-03-14 16:37:30 +00:00
# If you use GNU Parallel for a publication please cite:
# O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login:
# The USENIX Magazine, February 2011:42-47.
#
2011-02-02 15:36:29 +00:00
# Find GNU Parallel at http://www.gnu.org/software/parallel/
2010-07-09 12:53:56 +00:00
=head1 YouTube video2
Converting of WAV files to MP3 using GNU Parallel
# Run one jobs per CPU core
# For 'foo.wav' call the output file 'foo.mp3'
find music-files -type f | parallel -j+0 lame {} -o {.}.mp3
# Run one jobs per CPU core
# Run on local computer + 2 remote computers
# Give us progress information
# For 'foo.wav' call the output file 'foo.mp3'
2010-09-21 20:00:30 +00:00
find music-files -type f | parallel -j+0 -S :,server1,server2 \
2010-07-09 12:53:56 +00:00
--eta --trc {.}.mp3 lame {} -o {.}.mp3
2010-08-21 23:29:26 +00:00
# Colsep
# sem
2010-09-21 20:00:30 +00:00
# --retry
2010-08-21 23:29:26 +00:00
2010-10-05 20:22:52 +00:00
(echo a1.txt; echo b1.txt; echo c1.txt; echo a2.txt; echo b2.txt; echo c2.txt;)| \
parallel -X -N 3 my-program --file={}
(echo a1.txt; echo b1.txt; echo c1.txt; echo d1.txt; echo e1.txt; echo f1.txt;)| \
parallel -X my-program --file={}
# First job controls the tty
# -u needed because output should not be saved for later
find . -type f | parallel -uXj1 vim
find . -type f | parallel -uXj1 emacs
# If you have 1000 files only one contains 'foobar'
# stop when this one is found
find . -type f | parallel grep -l foobar | head -1
# To test a list of hosts are up and pingable save this
# to a file called machinesup
#!/usr/bin/parallel --shebang --no-run-if-empty ping -c 3 {} >/dev/null 2>&1
google.com
yahoo.com
nowhere.gone
# Then:
# chmod 755 machinesup
# ./machinesup || echo Some machines are down
2010-07-09 12:53:56 +00:00
2010-06-12 23:24:25 +00:00
=head1 YouTube video
GNU Parallel is a tool with lots of uses in shell. Every time you use
xargs or a for-loop GNU Parallel can probably do that faster, safer
and more readable.
If you have access to more computers through ssh, GNU Parallel makes
it easy to distribute jobs to these.
terminal2: ssh parallel@vh2.pi.dk
ssh parallel@vh2.pi.dk
2011-02-02 15:36:29 +00:00
and
2010-06-12 23:24:25 +00:00
2010-06-16 03:03:52 +00:00
PS1="\[\e[7m\]GNU Parallel:\[\033[01;34m\]\w\[\033[00m\e[27m\]$ "
2010-06-12 23:24:25 +00:00
gunzip logs/*gz
2010-06-16 03:03:52 +00:00
rm -f logs/*bz2*
2010-06-12 23:24:25 +00:00
rm -rf zip/*[^p]
2010-06-16 03:03:52 +00:00
rm -rf dirs/*
rm -rf parallel-*bz2
xvidcap
2010-06-22 13:24:55 +00:00
ffmpeg -i 20100616_002.mp4 -ab 320k -ar 44100 speak.mp3
2010-06-16 03:03:52 +00:00
# Merge video using youtube
#ffmpeg -i speak.mp3 -i xvidcap.mpeg -target mpeg -hq -minrate 8000000 \
2010-06-22 13:24:55 +00:00
#-title "GNU Parallel" -author "Ole Tange" -copyright "(CC-By-SA) 2010" -comment "Intro video of GNU Parallel 20100616" videoaudio.mpg
2010-06-16 03:03:52 +00:00
# GNU PARALLEL - BASIC USAGE
# A GNU tool for parallelizing shell commands
2010-06-12 23:24:25 +00:00
2010-06-22 13:24:55 +00:00
## Ole Tange Author
2010-06-12 23:24:25 +00:00
# GET GNU PARALLEL
2010-06-22 13:24:55 +00:00
wget ftp://ftp.gnu.org/gnu/parallel/parallel-20100620.tar.bz2
tar xjf parallel-20100620.tar.bz2
cd parallel-20100620
2010-06-12 23:24:25 +00:00
./configure && make ##
su
make install
exit
cd
2010-06-22 13:24:55 +00:00
## scp /usr/local/bin/parallel root@parallel:/usr/local/bin/
2010-06-12 23:24:25 +00:00
# YOUR FIRST PARALLEL JOBS
cd logs
du
2010-06-16 03:03:52 +00:00
/usr/bin/time gzip -1 *
## 24 sek - 22 sek
2010-06-12 23:24:25 +00:00
/usr/bin/time gunzip *
2010-06-16 03:03:52 +00:00
## 24 sek - 18
ls | time parallel gzip -1
## 17 sek - 10
2010-06-12 23:24:25 +00:00
l s | time parallel gunzip
2010-06-16 03:03:52 +00:00
## 25 sek - 19
2010-06-12 23:24:25 +00:00
# RECOMPRESS gz TO bz2
2010-06-16 03:03:52 +00:00
ls | time parallel gzip -1
2010-06-22 13:24:55 +00:00
ls *.gz | time parallel -j+0 --eta 'zcat {} | bzip2 -9 >{.}.bz2'
2010-06-16 03:03:52 +00:00
## Explain command line
2010-06-12 23:24:25 +00:00
## vis top local
2010-06-16 03:03:52 +00:00
## Man that is boring
2010-06-22 13:24:55 +00:00
## 2m41s - 2m - 3m35s
2010-06-16 03:03:52 +00:00
2010-06-22 13:24:55 +00:00
# RECOMPRESS gz TO bz2 USING local(:) AND REMOTE server1-4
ls *.gz |time parallel -j+0 --eta -Sserver1,server2,server3,server4,: \
--transfer --return {.}.bz2 --cleanup 'zcat {} | bzip2 -9 > {.}.bz2'
2010-06-16 03:03:52 +00:00
## Explain command line
## Explain server config
2010-06-12 23:24:25 +00:00
## vis top local
2010-06-16 03:03:52 +00:00
## vis top remote1-3
## 49 sek
2010-06-22 13:24:55 +00:00
# RECOMPRESS gz TO bz2 USING A SCRIPT ON local AND REMOTE server1-2,4
2010-06-16 03:03:52 +00:00
# (imagine the script is way more complex)
cp ../recompress /tmp
cat /tmp/recompress
2010-06-22 13:24:55 +00:00
ls *.gz |time parallel -j+0 --progress -Sserver1,server2,server4,: \
2010-06-16 03:03:52 +00:00
--trc {.}.bz2 --basefile /tmp/recompress '/tmp/recompress {} {.}.bz2'
2010-06-12 23:24:25 +00:00
# MAKING SMALL SCRIPTS
cd ../zip
ls -l
ls *.zip | parallel 'mkdir {.} && cd {.} && unzip ../{}' ###
ls -l
# GROUP OUTPUT
traceroute debian.org
traceroute debian.org & traceroute freenetproject.org ###
2010-06-16 03:03:52 +00:00
(echo debian.org; echo freenetproject.org) | parallel traceroute ###
2010-06-12 23:24:25 +00:00
# KEEP ORDER
2010-06-16 03:03:52 +00:00
(echo debian.org; echo freenetproject.org) | parallel -k traceroute ###
2010-06-12 23:24:25 +00:00
# RUN MANY JOBS. USE OUTPUT
# Find the number of hosts responding to ping
2010-06-16 03:03:52 +00:00
ping -c 1 178.63.11.1
ping -c 1 178.63.11.1 | grep '64 bytes'
2010-06-12 23:24:25 +00:00
seq 1 255 | parallel -j255 ping -c 1 178.63.11.{} 2>&1 \
| grep '64 bytes' | wc -l
seq 1 255 | parallel -j0 ping -c 1 178.63.11.{} 2>&1 \
| grep '64 bytes' | wc -l
# MULTIPLE ARGUMENTS
2010-06-22 13:24:55 +00:00
# make dir: test-(1-5000).dir
2010-06-12 23:24:25 +00:00
cd ../dirs
rm -rf *; sync
2010-06-16 03:03:52 +00:00
seq 1 10 | parallel echo mkdir test-{}.dir
seq 1 5000 | time parallel mkdir test-{}.dir
## 15 sek
2010-06-12 23:24:25 +00:00
rm -rf *; sync
2010-06-16 03:03:52 +00:00
seq 1 10 | parallel -X echo mkdir test-{}.dir
seq 1 5000 | time parallel -X mkdir test-{}.dir
2010-06-12 23:24:25 +00:00
# CALLING GNU PARALLEL FROM GNU PARALLEL
2010-06-16 03:03:52 +00:00
# make dir: top-(1-100)/sub-(1-100)
2010-06-12 23:24:25 +00:00
rm -rf *; sync
2010-06-16 03:03:52 +00:00
seq 1 100 | time parallel -I @@ \
'mkdir top-@@; seq 1 100 | parallel -X mkdir top-@@/sub-{}'
2010-06-12 23:24:25 +00:00
find | wc -l
2010-06-16 03:03:52 +00:00
cd
# Thank you for watching
#
# If you like GNU Parallel:
# * Post this video on your blog/Twitter/Facebook/Linkedin
# * Join the mailing list http://lists.gnu.org/mailman/listinfo/parallel
# * Request or write a review for your favourite magazine
# * Request or build a package for your favourite distribution
# * Invite me for your next conference (Contact http://ole.tange.dk)
#
# If GNU Parallel saves you money:
# * Donate to FSF https://my.fsf.org/donate/
#
2010-06-12 23:24:25 +00:00
# Find GNU Parallel at http://www.gnu.org/software/parallel/
2010-06-16 03:03:52 +00:00
2010-06-12 23:24:25 +00:00
# GIVE ME THE FIRST RESULT
(echo foss.org.my; echo debian.org; echo freenetproject.org) | parallel -H2 traceroute {}";false"
find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}
(echo foss.org.my; echo debian.org; echo freenetproject.org) | parallel traceroute
2010-06-09 22:39:35 +00:00
2010-04-19 07:07:12 +00:00
=head1 IDEAS
2010-06-05 23:03:39 +00:00
Kan vi lave flere ssh'er, hvis vi venter lidt?
En ssh med 20% loss og 900 ms delay, så kan login nås på 15 sek.
2010-04-19 07:07:12 +00:00
Test if -0 works on filenames ending in '\n'
2010-06-22 13:24:55 +00:00
2010-06-09 20:26:59 +00:00
=head1 options
2010-04-19 07:07:12 +00:00
2011-01-02 00:01:21 +00:00
One char options not used: A F G K O Q R Z c f
2010-04-19 07:07:12 +00:00
2010-06-09 20:26:59 +00:00
=head1 Unlikely
Accept signal INT instead of TERM to complete current running jobs but
do not start new jobs. Print out the number of jobs waiting to
complete on STDERR. Accept sig INT again to kill now. This seems to be
hard, as all foreground processes get the INT from the shell.
2010-06-05 23:03:39 +00:00
2010-08-14 18:39:33 +00:00
# Gzip all files in parallel
parallel gzip ::: *
# Convert *.wav to *.mp3 using LAME running one process per CPU core:
parallel -j+0 lame {} -o {.}.mp3 ::: *.wav
# Make an uncompressed version of all *.gz
parallel zcat {} ">"{.} ::: *.gz
# Recompress all .gz files using bzip2 running 1 job per CPU core:
find . -name '*.gz' | parallel -j+0 "zcat {} | bzip2 >{.}.bz2 && rm {}"
# Create a directory for each zip-file and unzip it in that dir
parallel 'mkdir {.}; cd {.}; unzip ../{}' ::: *.zip
2011-02-02 15:36:29 +00:00
# Convert all *.mp3 in subdirs to *.ogg running
2010-08-14 18:39:33 +00:00
# one process per CPU core on local computer and server2
find . -name '*.mp3' | parallel --trc {.}.ogg -j+0 -S server2,: \
'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg'
# Run mycmd on column 1-3 of each row of TAB separated values
parallel -a table_file.tsv --colsep '\t' mycmd -o {2} {3} -i {1}
# Run traceroute in parallel, but keep the output order the same
parallel -k traceroute ::: foss.org.my debian.org freenetproject.org
2011-02-23 15:01:18 +00:00
Test of signal passing through ssh
#!/bin/bash
SERVER1=parallel-server3
SERVER2=parallel-server2
export BG_PROC
start_remote_sleep() {
parallel -D -u -S parallel@$SERVER2 sleep ::: 370 &
BG_PROC=$!
while ! ssh parallel@$SERVER2 ps -A -o cmd | grep -q '^sleep 370' ; do
sleep 0.3
done
}
stop_local_parallel() {
kill -9 $BG_PROC
}
check_and_stop_remote_sleep() {
ssh parallel@$SERVER2 ps -A -o cmd | grep '^sleep 370'
ssh parallel@$SERVER2 killall sleep
}
echo '### Test kill signals'
start_remote_sleep 2>/dev/null
kill -1 $BG_PROC
check_and_stop_remote_sleep
sub propagate_signal {
my $signal = shift;
# $signal = "KILL";
::debug("Sending $signal to ",keys %Global::running);
kill $signal, keys %Global::running;
if(defined $Global::original_sig{$signal}) {
&{$Global::original_sig{$signal}};
}
}
my %do_not_propagate = map { $_ => 1 } qw(TTOU TTIN CONT TSTP __WARN__ __DIE__);
for (keys %SIG) {
$do_not_propagate{$_} and next;
$SIG{$_} = eval 'sub { propagate_signal("'.$_.'"); };';
}