--arg-sep and ::: implemented.

Ole Tange 2010-07-09 14:10:22 +02:00
parent 39d1c6bfa6
commit ccd17d35c5
6 changed files with 230 additions and 52 deletions
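
The core of this change is splitting the command line at the first occurrence of the argument separator. The sketch below illustrates that idea in bash; it is not the Perl code from this commit. The function name `split_at_sep` and the arrays `cmd` and `inputs` are hypothetical names chosen for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of GNU parallel: split the given words at the
# first occurrence of a separator string.
# Usage: split_at_sep SEP WORD...
# Sets the global arrays cmd (words before SEP) and inputs (words after SEP).
split_at_sep() {
  local sep=$1
  shift
  cmd=()
  inputs=()
  while (($# > 0)); do
    # Quoting "$sep" forces a literal comparison, not a pattern match.
    if [[ $1 == "$sep" ]]; then
      shift
      inputs=("$@")   # everything after the first separator is input
      return 0
    fi
    cmd+=("$1")
    shift
  done
  return 1            # no separator found: inputs would come from stdin
}

split_at_sep ::: gzip -9 ::: file1 file2
printf 'command: %s\n' "${cmd[*]}"
printf 'argument: %s\n' "${inputs[@]}"
```

Only the first separator splits the line, as in the loop added to parse_options, so a later occurrence ends up among the arguments.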


@ -75,6 +75,29 @@ B<{.}> can be used the same places as B<{}>. The replacement string
B<{.}> can be changed with B<-U>.
=item B<:::> (beta testing)
Use arguments on the command line as input instead of stdin (standard
input). Unlike other options for GNU B<parallel>, B<:::> is placed
after the command and before the arguments.
The following are equivalent:
(echo file1; echo file2) | parallel gzip
parallel gzip ::: file1 file2
parallel gzip {} ::: file1 file2
parallel --arg-sep ,, gzip {} ,, file1 file2
parallel --arg-sep ,, gzip ,, file1 file2
parallel ::: "gzip file1" "gzip file2"
To avoid treating B<:::> as special, use B<--arg-sep> to set the
argument separator to something else. See also B<--arg-sep>.
stdin (standard input) will be passed to the first process run.
If B<--arg-file> is set, arguments from that file will be appended.
=item B<--null>
=item B<-0>
@ -88,14 +111,26 @@ for processing arguments that may contain \n (newline).
=item B<-a> I<input-file>
Read items from the file I<input-file> instead of standard input. If
Read items from the file I<input-file> instead of stdin (standard input). If
you use this option, stdin is given to the first process run.
Otherwise, stdin is redirected from /dev/null.
=item B<--basefile> I<file> (beta testing)
=item B<--arg-sep> I<sep-str> (beta testing)
=item B<-B> I<file> (beta testing)
Use I<sep-str> instead of B<:::> as separator string. Useful if B<:::>
is used for something else by the command.
Also useful if your command uses B<:::> but you still want to read
arguments from stdin (standard input): Simply change B<--arg-sep> to a
string that is not in the command line.
See also: B<:::>.
=item B<--basefile> I<file>
=item B<-B> I<file>
I<file> will be transferred to each sshlogin before a job is
started. It will be removed if B<--cleanup> is active. The file may be
@ -165,7 +200,7 @@ If I<eof-str> is omitted, there is no end of file string. If neither
B<-E> nor B<-e> is used, no end of file string is used.
=item B<--eta> (beta testing)
=item B<--eta>
Show the estimated number of seconds before finishing. This forces GNU
B<parallel> to read all jobs before starting to find the number of
@ -197,9 +232,9 @@ B<-g> is the default. Can be reversed with B<-u>.
Print a summary of the options to GNU B<parallel> and exit.
=item B<--halt-on-error> <0|1|2> (beta testing)
=item B<--halt-on-error> <0|1|2>
=item B<-H> <0|1|2> (beta testing)
=item B<-H> <0|1|2>
=over 3
@ -394,7 +429,7 @@ default.
=item B<-r>
If the standard input only contains whitespace, do not run the command.
If the stdin (standard input) only contains whitespace, do not run the command.
=item B<--return> I<filename>
@ -452,9 +487,9 @@ operating system and the B<-s> option. Pipe the input from /dev/null
to do anything.
=item B<-S> I<[ncpu/]sshlogin[,[ncpu/]sshlogin[,...]]> (beta testing)
=item B<-S> I<[ncpu/]sshlogin[,[ncpu/]sshlogin[,...]]>
=item B<--sshlogin> I<[ncpu/]sshlogin[,[ncpu/]sshlogin[,...]]> (beta testing)
=item B<--sshlogin> I<[ncpu/]sshlogin[,[ncpu/]sshlogin[,...]]>
Distribute jobs to remote servers. The jobs will be run on a list of
remote servers. GNU B<parallel> will determine the number of CPU
@ -482,7 +517,7 @@ The remote host must have GNU B<parallel> installed.
B<--sshlogin> is known to cause problems with B<-m> and B<-X>.
=item B<--sshloginfile> I<filename> (beta testing)
=item B<--sshloginfile> I<filename>
File with sshlogins. The file consists of sshlogins on separate
lines. Empty lines and lines starting with '#' are ignored. Example:
@ -744,6 +779,10 @@ job per CPU core in parallel:
B<ls *.gz | parallel -j+0 "zcat {} | bzip2 >>B<{.}.bz2 && rm {}">
Convert all WAV files to MP3 using LAME:
B<find sounddir -type f -name '*.wav' | parallel -j+0 lame {} -o {.}.mp3>
=head1 EXAMPLE: Removing two file extensions when processing files and
calling GNU Parallel from itself
@ -752,7 +791,7 @@ If you have directory with tar.gz files and want these extracted in
the corresponding dir (e.g. foo.tar.gz will be extracted in the dir
foo) you can do:
B<ls *.tar.gz| parallel -U /// 'echo ///|parallel "mkdir -p {.} ; tar -C {.} -xf {.}.tar.gz"'>
B<ls *.tar.gz| parallel -U {tar} 'echo {tar}|parallel "mkdir -p {.} ; tar -C {.} -xf {.}.tar.gz"'>
=head1 EXAMPLE: Rewriting a for-loop and a while-loop
@ -1448,6 +1487,25 @@ B<seq 1 19 | parallel -j+0 buffon -o - | sort -n >>B< result>
B<cat files | parallel -j+0 cmd>
=head2 DIFFERENCES BETWEEN ClusterSSH AND GNU Parallel
ClusterSSH solves a different problem than GNU B<parallel>.
ClusterSSH runs the same command with the same arguments on a list of
machines - one per machine. This is typically used for administering
several machines that are almost identical.
GNU B<parallel> runs the same (or different) commands with different
arguments in parallel, possibly using remote machines to help
computing. If more than one machine is listed in B<-S>, GNU B<parallel> may
only use one of these (e.g. if there are 8 jobs to be run and one
machine has 8 cores).
GNU B<parallel> can be used as a poor man's version of ClusterSSH:
B<cat hostlist | parallel ssh {} do_stuff>
=head1 BUGS
Filenames beginning with '-' can cause some commands to give
@ -1619,7 +1677,7 @@ if($::opt_halt_on_error) {
sub parse_options {
# Returns: N/A
# Defaults:
$Global::version = 20100620;
$Global::version = 20100705;
$Global::progname = 'parallel';
$Global::debug = 0;
$Global::verbose = 0;
@ -1638,6 +1696,7 @@ sub parse_options {
$Global::halt_on_error_exitstatus = 0;
$Global::total_jobs = 0;
$Global::eta = 0;
$Global::arg_sep = ":::";
Getopt::Long::Configure ("bundling","require_order");
# Add options from .parallelrc
@ -1683,6 +1742,7 @@ sub parse_options {
"halt-on-error|H=s" => \$::opt_halt_on_error,
"progress" => \$::opt_progress,
"eta" => \$::opt_eta,
"arg-sep|argsep=s" => \$::opt_arg_sep,
# xargs-compatibility - implemented, man, unittest
"max-procs|P=s" => \$::opt_P,
"delimiter|d=s" => \$::opt_d,
@ -1729,6 +1789,7 @@ sub parse_options {
}
if(defined $::opt_n and $::opt_n) { $Global::max_number_of_args = $::opt_n; }
if(defined $::opt_help) { die_usage(); }
if(defined $::opt_arg_sep) { $Global::arg_sep = $::opt_arg_sep; }
if(defined $::opt_number_of_cpus) { print no_of_cpus(),"\n"; exit(0); }
if(defined $::opt_number_of_cores) { print no_of_cores(),"\n"; exit(0); }
if(defined $::opt_max_line_length_allowed) { print real_max_length(),"\n"; exit(0); }
@ -1750,8 +1811,29 @@ sub parse_options {
# so default to -X
$Global::Xargs = 1;
}
if(grep /^\Q$Global::arg_sep\E$/o, @ARGV) {
# Arguments on the command line.
# Ignore STDIN by reading from /dev/null
# or another file if user has given --arg-file
$::opt_a ||= "/dev/null";
# Input: @ARGV = command option ::: arg arg arg
my @new_argv = ();
while(@ARGV) {
my $arg = shift @ARGV;
# \Q..\E: match the separator literally, even if it contains regexp chars
if($arg =~ /^\Q$Global::arg_sep\E$/o) {
unget_arg(@ARGV);
@ARGV=();
} else {
push @new_argv, $arg;
}
}
# Output: @ARGV = command option
@ARGV=@new_argv;
}
if(defined $::opt_a) {
# must be done after opt_arg_sep
if(not open(ARGFILE,"<",$::opt_a)) {
print STDERR "$Global::progname: ".
"Cannot open input file `$::opt_a': ".


@ -0,0 +1,32 @@
### Test basic --arg-sep
a
b
### Run commands using --arg-sep
echo a
a
echo b
b
### Change --arg-sep
echo a
a
echo b
b
echo a
a
echo b
b
### Test stdin goes to first command only
cat -
via first cat
cat -
cat
via cat
echo b
b
cat
via cat
echo b
b
echo a
a
cat


@ -1,72 +1,75 @@
#!/bin/bash
PAR=parallel
# Test {.}
echo '### Test weird regexp chars'
seq 1 6 | $PAR -j1 -I :: -X echo a::b::^c::[.}c
seq 1 6 | parallel -j1 -I :: -X echo a::b::^c::[.}c
rsync -Ha --delete input-files/testdir2/ tmp/
cd tmp
echo '### Test {.} and {}'
find . -name '*.jpg' | $PAR -j +0 convert -geometry 120 {} {.}_thumb.jpg
find . -name '*.jpg' | parallel -j +0 convert -geometry 120 {} {.}_thumb.jpg
echo '### Test {.} with files that have no . but dir does'
mkdir -p /tmp/test-of-{.}-parallel/subdir
touch /tmp/test-of-{.}-parallel/subdir/file
touch /tmp/test-of-{.}-parallel/subdir/file{.}.funkyextension}}
find /tmp/test-of-{.}-parallel -type f | $PAR echo {.} | sort
find /tmp/test-of-{.}-parallel -type f | parallel echo {.} | sort
rm -rf /tmp/test-of-{.}-parallel/subdir
find -type f | $PAR -k diff {} a/foo ">"{.}.diff
ls | $PAR -kvg "ls {}|wc;echo {}"
ls | $PAR -kj500 'sleep 1; ls {} | perl -ne "END{print $..\" {}\n\"}"'
ls | $PAR -kgj500 'sleep 1; ls {} | perl -ne "END{print $..\" {}\n\"}"'
find -type f | parallel -k diff {} a/foo ">"{.}.diff
ls | parallel -kvg "ls {}|wc;echo {}"
ls | parallel -kj500 'sleep 1; ls {} | perl -ne "END{print $..\" {}\n\"}"'
ls | parallel -kgj500 'sleep 1; ls {} | perl -ne "END{print $..\" {}\n\"}"'
mkdir 1-col 2-col
ls | $PAR -kv touch -- {.}/abc-{.}-{} 2>&1
ls | $PAR -kv rm -- {.}/abc-{.}-{} 2>&1
#test05.sh:find . -type d -print0 | perl -0 -pe 's:^./::' | $PAR -0 -v touch -- {}/abc-{}-{} 2>&1 \
#test05.sh:find . -type d -print0 | perl -0 -pe 's:^./::' | $PAR -0 -v rm -- {}/abc-{}-{} 2>&1 \
#test05.sh:find . -type d -print0 | perl -0 -pe 's:^./::' | $PAR -0 -v rmdir -- {} 2>&1 \
ls | parallel -kv touch -- {.}/abc-{.}-{} 2>&1
ls | parallel -kv rm -- {.}/abc-{.}-{} 2>&1
#test05.sh:find . -type d -print0 | perl -0 -pe 's:^./::' | parallel -0 -v touch -- {}/abc-{}-{} 2>&1 \
#test05.sh:find . -type d -print0 | perl -0 -pe 's:^./::' | parallel -0 -v rm -- {}/abc-{}-{} 2>&1 \
#test05.sh:find . -type d -print0 | perl -0 -pe 's:^./::' | parallel -0 -v rmdir -- {} 2>&1 \
echo '### Test -m'
(echo foo;echo bar;echo joe.gif) | $PAR -km echo 1{}2{.}3 A{.}B{.}C
(echo foo;echo bar;echo joe.gif) | $PAR -kX echo 1{}2{.}3 A{.}B{.}C
seq 1 6 | $PAR -k printf '{}.gif\\n' | $PAR -km echo a{}b{.}c{.}
seq 1 6 | $PAR -k printf '{}.gif\\n' | $PAR -kX echo a{}b{.}c{.}
(echo foo;echo bar;echo joe.gif) | parallel -km echo 1{}2{.}3 A{.}B{.}C
(echo foo;echo bar;echo joe.gif) | parallel -kX echo 1{}2{.}3 A{.}B{.}C
seq 1 6 | parallel -k printf '{}.gif\\n' | parallel -km echo a{}b{.}c{.}
seq 1 6 | parallel -k printf '{}.gif\\n' | parallel -kX echo a{}b{.}c{.}
echo '### Test -m with 60000 args'
seq 1 60000 | perl -pe 's/$/.gif\n/' | $PAR -km echo a{}b{.}c{.} | mop -d 4 "|md5sum" "| wc"
seq 1 60000 | perl -pe 's/$/.gif\n/' | parallel -km echo a{}b{.}c{.} | mop -d 4 "|md5sum" "| wc"
echo '### Test -X with 60000 args'
seq 1 60000 | perl -pe 's/$/.gif\n/' | $PAR -kX echo a{}b{.}c{.} | mop -d 4 "|md5sum" "| wc"
seq 1 60000 | perl -pe 's/$/.gif\n/' | parallel -kX echo a{}b{.}c{.} | mop -d 4 "|md5sum" "| wc"
echo '### Test -X with 60000 args and 5 expansions'
seq 1 60000 | perl -pe 's/$/.gif\n/' | $PAR -kX echo a{}b{.}c{.}{.}{} | wc -l
seq 1 60000 | perl -pe 's/$/.gif\n/' | $PAR -kX echo a{}b{.}c{.}{.} | wc -l
seq 1 60000 | perl -pe 's/$/.gif\n/' | $PAR -kX echo a{}b{.}c{.} | wc -l
seq 1 60000 | perl -pe 's/$/.gif\n/' | $PAR -kX echo a{}b{.}c | wc -l
seq 1 60000 | perl -pe 's/$/.gif\n/' | $PAR -kX echo a{}b | wc -l
seq 1 60000 | perl -pe 's/$/.gif\n/' | parallel -kX echo a{}b{.}c{.}{.}{} | wc -l
seq 1 60000 | perl -pe 's/$/.gif\n/' | parallel -kX echo a{}b{.}c{.}{.} | wc -l
seq 1 60000 | perl -pe 's/$/.gif\n/' | parallel -kX echo a{}b{.}c{.} | wc -l
seq 1 60000 | perl -pe 's/$/.gif\n/' | parallel -kX echo a{}b{.}c | wc -l
seq 1 60000 | perl -pe 's/$/.gif\n/' | parallel -kX echo a{}b | wc -l
echo '### Test {.} does not repeat more than {}'
seq 1 15 | perl -pe 's/$/.gif\n/' | parallel -s 80 -kX echo a{}b{.}c{.}
seq 1 15 | perl -pe 's/$/.gif\n/' | parallel -s 80 -km echo a{}b{.}c{.}
seq 1 15 | perl -pe 's/$/.gif/' | parallel -s 80 -kX echo a{}b{.}c{.}
seq 1 15 | perl -pe 's/$/.gif/' | parallel -s 80 -km echo a{}b{.}c{.}
echo '### Test -I with shell meta chars'
seq 1 60000 | $PAR -I :: -X echo a::b::c:: | wc -l
seq 1 60000 | $PAR -I '<>' -X echo 'a<>b<>c<>' | wc -l
seq 1 60000 | $PAR -I '<' -X echo 'a<b<c<' | wc -l
seq 1 60000 | $PAR -I '>' -X echo 'a>b>c>' | wc -l
seq 1 60000 | parallel -I :: -X echo a::b::c:: | wc -l
seq 1 60000 | parallel -I '<>' -X echo 'a<>b<>c<>' | wc -l
seq 1 60000 | parallel -I '<' -X echo 'a<b<c<' | wc -l
seq 1 60000 | parallel -I '>' -X echo 'a>b>c>' | wc -l
echo '### Test {.}'
echo a | $PAR -qX echo "'"{.}"' "
echo a | $PAR -qX echo "'{.}'"
(echo "sleep 3; echo begin"; seq 1 30 | $PAR -kq echo "sleep 1; echo {.}"; echo "echo end") \
| $PAR -k -j0
echo a | parallel -qX echo "'"{.}"' "
echo a | parallel -qX echo "'{.}'"
(echo "sleep 3; echo begin"; seq 1 30 | parallel -kq echo "sleep 1; echo {.}"; echo "echo end") \
| parallel -k -j0
echo '### Test -I with -X and -m'
seq 1 10 | $PAR -k 'seq 1 {.} | '$PAR' -k -I :: echo {.} ::'
seq 1 10 | $PAR -k 'seq 1 {.} | '$PAR' -X -k -I :: echo a{.} b::'
seq 1 10 | $PAR -k 'seq 1 {.} | '$PAR' -m -k -I :: echo a{.} b::'
seq 1 10 | parallel -k 'seq 1 {.} | 'parallel' -k -I :: echo {.} ::'
seq 1 10 | parallel -k 'seq 1 {.} | 'parallel' -X -k -I :: echo a{.} b::'
seq 1 10 | parallel -k 'seq 1 {.} | 'parallel' -m -k -I :: echo a{.} b::'
echo '### Test -i'
(echo a; echo END; echo b) | $PAR -k -i -eEND echo repl{.}ce
(echo a; echo END; echo b) | parallel -k -i -eEND echo repl{.}ce
echo '### Test --replace'
(echo a; echo END; echo b) | $PAR -k --replace -eEND echo repl{.}ce
(echo a; echo END; echo b) | parallel -k --replace -eEND echo repl{.}ce
echo '### Test -t'
(echo b; echo c; echo f) | $PAR -k -t echo {.}ar 2>&1 >/dev/null
(echo b; echo c; echo f) | parallel -k -t echo {.}ar 2>&1 >/dev/null
echo '### Test --verbose'
(echo b; echo c; echo f) | $PAR -k --verbose echo {.}ar 2>&1 >/dev/null
(echo b; echo c; echo f) | parallel -k --verbose echo {.}ar 2>&1 >/dev/null


@ -0,0 +1,14 @@
#!/bin/bash
echo '### Test basic --arg-sep'
parallel -k echo ::: a b
echo '### Run commands using --arg-sep'
parallel -kv ::: 'echo a' 'echo b'
echo '### Change --arg-sep'
parallel --arg-sep ::: -kv ::: 'echo a' 'echo b'
parallel --arg-sep .--- -kv .--- 'echo a' 'echo b'
echo '### Test stdin goes to first command only'
echo via first cat |parallel -kv cat ::: - -
echo via cat |parallel --arg-sep .--- -kv .--- 'cat' 'echo b'
echo via cat |parallel -kv ::: 'cat' 'echo b'
echo no output |parallel -kv ::: 'echo a' 'cat'
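
The last group of tests checks that only the first command sees stdin. As a rough illustration of that rule, and not of how GNU parallel implements it, the sketch below runs command strings one after another and redirects stdin from /dev/null for all but the first. The function name `run_first_gets_stdin` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical illustration, not GNU parallel's implementation: run each
# command string in turn, letting only the first one read our stdin.
run_first_gets_stdin() {
  local first=1 c
  for c in "$@"; do
    if ((first)); then
      first=0
      bash -c "$c"              # inherits the caller's stdin
    else
      bash -c "$c" </dev/null   # sees immediate end of file
    fi
  done
}

# 'cat' runs first and copies stdin; 'echo b' never reads it.
echo "via cat" | run_first_gets_stdin 'cat' 'echo b'

# 'echo a' runs first and ignores stdin; 'cat' gets /dev/null and prints nothing.
echo "no output" | run_first_gets_stdin 'echo a' 'cat'
```

The second call mirrors the final test case: the text on stdin is never printed, because the only command that reads stdin is not the first one.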


@ -74,6 +74,21 @@ a1.gifb1c1 a2.gifb2c2 a3.gifb3c3 a4.gifb4c4 a5.gifb5c5 a6.gifb6c6
13
10
7
### Test {.} does not repeat more than {}
a1.gifb1c1 abc a2.gifb2c2 abc a3.gifb3c3 abc a4.gifb4c4 abc a5.gifb5c5 abc
a6.gifb6c6 abc a7.gifb7c7 abc a8.gifb8c8 abc a9.gifb9c9 abc a10.gifb10c10
abc a11.gifb11c11 abc a12.gifb12c12 abc a13.gifb13c13 abc a14.gifb14c14
abc a15.gifb15c15 abc
a1.gif 2.gif 3.gif 4.gif 5.gif b1 2 3 4 5 6c1 2 3 4 5 6
a6.gif 7.gif 8.gif 9.gif 10.gif b6 7 8 9 10 11c6 7 8 9 10 11
a11.gif 12.gif 13.gif 14.gif b11 12 13 14 15c11 12 13 14 15
a15.gif b15 c15
a1.gifb1c1 a2.gifb2c2 a3.gifb3c3 a4.gifb4c4 a5.gifb5c5 a6.gifb6c6
a7.gifb7c7 a8.gifb8c8 a9.gifb9c9 a10.gifb10c10 a11.gifb11c11 a12.gifb12c12
a13.gifb13c13 a14.gifb14c14 a15.gifb15c15
a1.gif 2.gif 3.gif 4.gif 5.gif 6.gif 7.gifb1 2 3 4 5 6 7 8c1 2 3 4 5 6 7 8
a8.gif 9.gif 10.gif 11.gif 12.gif 13.gifb8 9 10 11 12 13 14c8 9 10 11 12 13 14
a14.gif 15.gifb14 15c14 15
### Test -I with shell meta chars
9
9


@ -0,0 +1,32 @@
### Test basic --arg-sep
a
b
### Run commands using --arg-sep
echo a
a
echo b
b
### Change --arg-sep
echo a
a
echo b
b
echo a
a
echo b
b
### Test stdin goes to first command only
cat -
via first cat
cat -
cat
via cat
echo b
b
cat
via cat
echo b
b
echo a
a
cat