Wrote missing man page entries for xargs compatibility.

Fixed bug in --arg-file.
Implemented --show-limits.
Ole Tange 2010-04-21 21:28:00 +02:00
parent 3e50ba19cf
commit 495d8bc0bd
4 changed files with 271 additions and 74 deletions

@ -57,6 +57,35 @@ Use NUL as delimiter. Normally input lines will end in \n
for processing filenames that may contain \n (newline). for processing filenames that may contain \n (newline).
=item B<--arg-file>=I<file>
=item B<-a> I<file>
Read items from file instead of standard input. If you use this
option, stdin is given to the first process run. Otherwise, stdin is
redirected from /dev/null.
=item B<--cleanup> (not implemented)
Remove transferred files. B<--cleanup> will remove the transferred files
on the remote server after processing is done.
find log -name '*gz' | parallel \
--sshlogin server.example.com --transfer --return {.}.bz2 \
--cleanup "zcat {} | bzip2 -9 >{.}.bz2"
With B<--transfer> the file transferred to the remote server will be
removed on the remote server. Directories created will not be removed
- even if they are empty.
With B<--return> the file transferred from the remote server will be
removed on the remote server. Directories created will not be removed
- even if they are empty.
B<--cleanup> is ignored when not used with B<--transfer> or B<--return>.
=item B<--command> =item B<--command>
=item B<-c> =item B<-c>
@ -81,6 +110,24 @@ as \n, or an octal or hexadecimal escape code. Octal and
hexadecimal escape codes are understood as for the printf command. hexadecimal escape codes are understood as for the printf command.
Multibyte characters are not supported. Multibyte characters are not supported.
=item B<-E> I<eof-str>
Set the end of file string to eof-str. If the end of file string
occurs as a line of input, the rest of the input is ignored. If
neither B<-E> nor B<-e> is used, no end of file string is used.
=item B<--eof>[=I<eof-str>]
=item B<-e>[I<eof-str>]
This option is a synonym for the B<-E> option. Use B<-E> instead,
because it is POSIX compliant for B<xargs> while this option is not.
If I<eof-str> is omitted, there is no end of file string. If neither
B<-E> nor B<-e> is used, no end of file string is used.
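The end-of-file string behaviour described for B<-E> and B<-e> can be illustrated with a small shell sketch. The function name read_until_eof is hypothetical and is not part of parallel; it only mimics the documented rule that input after a line equal to eof-str is ignored, and that an omitted (empty) eof-str disables the check.

```shell
# Hypothetical helper (not part of parallel): copy stdin to stdout until a
# whole line equals the end-of-file string given as the first argument.
read_until_eof() {
    eof_str=$1
    while IFS= read -r line; do
        # An empty eof_str means "no end of file string", so never stop early.
        if [ -n "$eof_str" ] && [ "$line" = "$eof_str" ]; then
            break
        fi
        printf '%s\n' "$line"
    done
}

# Only "a" is printed; "b" comes after the END line and is ignored.
printf 'a\nEND\nb\n' | read_until_eof END
```

The same pattern appears in the tests of this commit, where input such as (echo a; echo END; echo b) is combined with -eEND.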
=item B<--file> =item B<--file>
@ -99,17 +146,25 @@ Group output. Output from each job is grouped together and is only
printed when the command is finished. STDERR first followed by STDOUT. printed when the command is finished. STDERR first followed by STDOUT.
B<-g> is the default. Can be reversed with B<-u>. B<-g> is the default. Can be reversed with B<-u>.
=item B<--help>
=item B<-h>
Print a summary of the options to B<parallel> and exit.
=item B<-I> I<string> =item B<-I> I<string>
Use the replacement string I<string> instead of {}. Use the replacement string I<string> instead of {}.
=item B<-U> I<string> =item B<--replace>[=I<replace-str>]
=item B<--extensionreplace> I<string> =item B<-i>[I<replace-str>]
Use the replacement string I<string> instead of {.} for input line without extension. This option is a synonym for B<-I>I<replace-str> if I<replace-str> is
specified, and for B<-I>{} otherwise. This option is deprecated;
use B<-I> instead.
=item B<--jobs> I<N> =item B<--jobs> I<N>
@ -170,6 +225,16 @@ If the evaluated number is less than 1 then 1 will be used. See also
Keep sequence of output same as the order of input. If jobs 1 2 3 4 Keep sequence of output same as the order of input. If jobs 1 2 3 4
end in the sequence 3 1 4 2 the output will still be 1 2 3 4. end in the sequence 3 1 4 2 the output will still be 1 2 3 4.
=item B<--max-args>=I<max-args>
=item B<-n> I<max-args>
Use at most I<max-args> arguments per command line. Fewer than
I<max-args> arguments will be used if the size (see the B<-s> option)
is exceeded, unless the B<-x> option is given, in which case
B<parallel> will exit.
Only used with B<-m> and B<-X>.
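Since this option is documented for B<xargs> compatibility, the grouping it describes can be seen with B<xargs> itself. This is a minimal sketch, assuming that parallel groups arguments the same way when B<-m> or B<-X> is in effect; the function name group_args is hypothetical.

```shell
# Hypothetical wrapper: run echo with at most $1 arguments per command line.
group_args() {
    xargs -n "$1" echo
}

# Five input items with max-args 2 give three command lines:
# two full ones and a final one with the single leftover argument.
printf '1\n2\n3\n4\n5\n' | group_args 2
```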
=item B<--number-of-cpus> =item B<--number-of-cpus>
@ -193,6 +258,75 @@ QUOTING. Most people will never need this. Quoting is disabled by
default. default.
=item B<--interactive>
=item B<-p>
Prompt the user about whether to run each command line and read a line
from the terminal. Only run the command line if the response starts
with 'y' or 'Y'. Implies B<-t>.
=item B<--no-run-if-empty>
=item B<-r>
If the standard input does not contain any nonblanks, do not run the
command.
=item B<--return> I<filename> (not implemented)
Transfer files from remote servers. B<--return> is used with
B<--sshlogin> when the arguments are files on the remote servers. When
processing is done the file I<filename> will be transferred
from the remote server using B<rsync> and will be put relative to
the default login dir. E.g.
echo foo/bar.txt | parallel \
--sshlogin server.example.com --return {}.out touch {}.out
This will transfer the file I<$HOME/foo/bar.txt.out> from the server
I<server.example.com> to the file I<foo/bar.txt.out> after running
B<touch foo/bar.txt.out> on I<server.example.com>.
echo /tmp/foo/bar.txt | parallel \
--sshlogin server.example.com --return {}.out touch {}.out
This will transfer the file I</tmp/foo/bar.txt.out> from the server
I<server.example.com> to the file I</tmp/foo/bar.txt.out> after running
B<touch /tmp/foo/bar.txt.out> on I<server.example.com>.
Multiple files can be transferred by repeating the options multiple
times:
echo /tmp/foo/bar.txt | \
parallel --sshlogin server.example.com \
--return {}.out --return {}.out2 touch {}.out {}.out2
B<--return> is often used with B<--transfer> and B<--cleanup>.
B<--return> is ignored when used with B<--sshlogin :> or when not used with B<--sshlogin>.
=item B<--max-chars>=I<max-chars>
=item B<-s> I<max-chars>
Use at most max-chars characters per command line, including the
command and initial-arguments and the terminating nulls at the ends of
the argument strings. The largest allowed value is system-dependent,
and is calculated as the argument length limit for exec, less the size
of your environment. The default value is the maximum.
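The calculation described above (the argument length limit for exec, less the size of the environment) can be approximated in shell. This is only a rough sketch: it assumes getconf ARG_MAX reports the exec limit and that the byte count of env output is close to the environment size. The value parallel actually probes may differ, for instance where the system also limits the length of a single argument.

```shell
# Rough estimate only; not the method parallel uses internally.
arg_max=$(getconf ARG_MAX)       # system limit for exec arguments plus environment
env_bytes=$(env | wc -c)         # approximate size of the current environment
estimate=$((arg_max - env_bytes))

echo "ARG_MAX:              $arg_max"
echo "Environment (approx): $env_bytes"
echo "Estimated max chars:  $estimate"
```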
=item B<--show-limits>
Display the limits on the command-line length which are imposed by the
operating system and the -s option. Pipe the input from /dev/null
(and perhaps specify --no-run-if-empty) if you don't want B<parallel>
to do anything.
=item B<-S> I<[ncpu/]sshlogin[,[ncpu/]sshlogin]> (not implemented) =item B<-S> I<[ncpu/]sshlogin[,[ncpu/]sshlogin]> (not implemented)
=item B<--sshlogin> I<[ncpu/]sshlogin[,[ncpu/]sshlogin]> (not implemented) =item B<--sshlogin> I<[ncpu/]sshlogin[,[ncpu/]sshlogin]> (not implemented)
@ -240,6 +374,16 @@ Silent. The job to be run will not be printed. This is the default.
Can be reversed with B<-v>. Can be reversed with B<-v>.
=item B<--verbose>
=item B<-t>
Print the command line on the standard error output before executing
it.
See also B<-v>.
=item B<--transfer> (not implemented) =item B<--transfer> (not implemented)
Transfer files to remote servers. B<--transfer> is used with Transfer files to remote servers. B<--transfer> is used with
@ -273,60 +417,6 @@ Transfer, Return, Cleanup. Short hand for:
--transfer --return I<filename> --cleanup --transfer --return I<filename> --cleanup
=item B<--return> I<filename> (not implemented)
Transfer files from remote servers. B<--return> is used with
B<--sshlogin> when the arguments are files on the remote servers. When
processing is done the file I<filename> will be transferred
from the remote server using B<rsync> and will be put relative to
the default login dir. E.g.
echo foo/bar.txt | parallel \
--sshlogin server.example.com --return {}.out touch {}.out
This will transfer the file I<$HOME/foo/bar.txt.out> from the server
I<server.example.com> to the file I<foo/bar.txt.out> after running
B<touch foo/bar.txt.out> on I<server.example.com>.
echo /tmp/foo/bar.txt | parallel \
--sshlogin server.example.com --return {}.out touch {}.out
This will transfer the file I</tmp/foo/bar.txt.out> from the server
I<server.example.com> to the file I</tmp/foo/bar.txt.out> after running
B<touch /tmp/foo/bar.txt.out> on I<server.example.com>.
Multiple files can be transferred by repeating the options multiple
times:
echo /tmp/foo/bar.txt | \
parallel --sshlogin server.example.com \
--return {}.out --return {}.out2 touch {}.out {}.out2
B<--return> is often used with B<--transfer> and B<--cleanup>.
B<--return> is ignored when used with B<--sshlogin :> or when not used with B<--sshlogin>.
=item B<--cleanup> (not implemented)
Remove transferred files. B<--cleanup> will remove the transferred files
on the remote server after processing is done.
find log -name '*gz' | parallel \
--sshlogin server.example.com --transfer --return {.}.bz2 \
--cleanup "zcat {} | bzip2 -9 >{.}.bz2"
With B<--transfer> the file transferred to the remote server will be
removed on the remote server. Directories created will not be removed
- even if they are empty.
With B<--return> the file transferred from the remote server will be
removed on the remote server. Directories created will not be removed
- even if they are empty.
B<--cleanup> is ignored when not used with B<--transfer> or B<--return>.
=item B<--ungroup> =item B<--ungroup>
=item B<-u> =item B<-u>
@ -335,6 +425,13 @@ Ungroup output. Output is printed as soon as possible. This may cause
output from different commands to be mixed. Can be reversed with B<-g>. output from different commands to be mixed. Can be reversed with B<-g>.
=item B<-U> I<string>
=item B<--extensionreplace> I<string>
Use the replacement string I<string> instead of {.} for input line without extension.
=item B<--use-cpus-instead-of-cores> (not implemented) =item B<--use-cpus-instead-of-cores> (not implemented)
Count the number of CPUs instead of cores. When computing how many Count the number of CPUs instead of cores. When computing how many
@ -348,7 +445,14 @@ Normal users will not need this option.
=item B<-v> =item B<-v>
Verbose. Print the job to be run on STDOUT. Can be reversed with Verbose. Print the job to be run on STDOUT. Can be reversed with
B<--silent>. B<--silent>. See also B<-t>.
=item B<--version>
=item B<-V>
Print the version of B<parallel> and exit.
=item B<--xargs> =item B<--xargs>
@ -800,6 +904,10 @@ Copyright (C) 2008,2009,2010 Ole Tange, http://ole.tange.dk
Copyright (C) 2010 Ole Tange, http://ole.tange.dk and Free Software Copyright (C) 2010 Ole Tange, http://ole.tange.dk and Free Software
Foundation, Inc. Foundation, Inc.
Parts of the manual concerning B<xargs> compatibility are inspired by
the manual of B<xargs> from GNU findutils 4.4.2.
=head1 LICENSE =head1 LICENSE
@ -949,7 +1057,6 @@ GetOptions("debug|D" => \$::opt_D,
# xargs-compatibility - implemented, man, unittest # xargs-compatibility - implemented, man, unittest
"max-procs|P=s" => \$::opt_P, "max-procs|P=s" => \$::opt_P,
"delimiter|d=s" => \$::opt_d, "delimiter|d=s" => \$::opt_d,
# xargs-compatibility - implemented, unittest - man missing
"max-chars|s=i" => \$::opt_s, "max-chars|s=i" => \$::opt_s,
"arg-file|a=s" => \$::opt_a, "arg-file|a=s" => \$::opt_a,
"no-run-if-empty|r" => \$::opt_r, "no-run-if-empty|r" => \$::opt_r,
@ -957,19 +1064,21 @@ GetOptions("debug|D" => \$::opt_D,
"E=s" => \$::opt_E, "E=s" => \$::opt_E,
"eof|e:s" => \$::opt_E, "eof|e:s" => \$::opt_E,
"max-args|n=i" => \$::opt_n, "max-args|n=i" => \$::opt_n,
"verbose|t" => \$::opt_verbose,
"help|h" => \$::opt_help, "help|h" => \$::opt_help,
"verbose|t" => \$::opt_verbose,
"version|V" => \$::opt_version, "version|V" => \$::opt_version,
## xargs-compatibility - implemented - unittest missing - man missing "show-limits" => \$::opt_show_limits,
## xargs-compatibility - implemented, man - unittest missing
"interactive|p" => \$::opt_p, "interactive|p" => \$::opt_p,
## How to unittest? tty must be emulated ## How to unittest? tty must be emulated
# xargs-compatibility - implemented, unittest - man missing
#none
# xargs-compatibility - unimplemented # xargs-compatibility - unimplemented
"L=i" => \$::opt_L, "L=i" => \$::opt_L,
"max-lines|l:i" => \$::opt_l, "max-lines|l:i" => \$::opt_l,
## (echo a b;echo c) | xargs -l1 echo ## (echo a b;echo c) | xargs -l1 echo
## (echo a b' ';echo c) | xargs -l1 echo ## (echo a b' ';echo c) | xargs -l1 echo
"show-limits" => \$::opt_show_limits,
"exit|x" => \$::opt_x, "exit|x" => \$::opt_x,
) || die_usage(); ) || die_usage();
@ -1017,6 +1126,7 @@ if(defined $::opt_help) { die_usage(); }
if(defined $::opt_number_of_cpus) { print no_of_cpus(),"\n"; exit(0); } if(defined $::opt_number_of_cpus) { print no_of_cpus(),"\n"; exit(0); }
if(defined $::opt_number_of_cores) { print no_of_cores(),"\n"; exit(0); } if(defined $::opt_number_of_cores) { print no_of_cores(),"\n"; exit(0); }
if(defined $::opt_version) { version(); exit(0); } if(defined $::opt_version) { version(); exit(0); }
if(defined $::opt_show_limits) { show_limits(); }
if(defined $::opt_a) { if(defined $::opt_a) {
if(not open(ARGFILE,"<".$::opt_a)) { if(not open(ARGFILE,"<".$::opt_a)) {
@ -1208,12 +1318,7 @@ sub max_length_of_command_line {
# Find the max_length of a command line # Find the max_length of a command line
# First find an upper bound # First find an upper bound
if(not $Global::command_line_max_len) { if(not $Global::command_line_max_len) {
my $len = 10; $Global::command_line_max_len = real_max_length();
do {
$len *= 10;
} while (is_acceptable_command_line_length($len));
# Then search for the actual max length between 0 and upper bound
$Global::command_line_max_len = binary_find_max_length(int(($len)/10),$len);
if($::opt_s) { if($::opt_s) {
if($::opt_s <= $Global::command_line_max_len) { if($::opt_s <= $Global::command_line_max_len) {
$Global::command_line_max_len = $::opt_s; $Global::command_line_max_len = $::opt_s;
@ -1226,6 +1331,16 @@ sub max_length_of_command_line {
return $Global::command_line_max_len; return $Global::command_line_max_len;
} }
sub real_max_length {
my $len = 10;
do {
$len *= 10;
} while (is_acceptable_command_line_length($len));
# Then search for the actual max length between 0 and upper bound
return binary_find_max_length(int(($len)/10),$len);
}
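The search strategy in real_max_length (grow the length by factors of ten until it is rejected, then search between the last accepted and the first rejected value) can be sketched in shell. The names FAKE_LIMIT, is_acceptable_len and find_max_len are hypothetical stand-ins: the real is_acceptable_command_line_length and binary_find_max_length bodies are not shown in this diff, so the predicate below only compares against a pretend limit, and the bisection is one plausible way to do the second step.

```shell
# Pretend system limit, used instead of actually running long commands.
FAKE_LIMIT=131071

# Hypothetical stand-in for is_acceptable_command_line_length.
is_acceptable_len() {
    [ "$1" -le "$FAKE_LIMIT" ]
}

find_max_len() {
    len=10
    # Step 1: multiply by ten until a length is rejected (upper bound).
    while :; do
        len=$((len * 10))
        is_acceptable_len "$len" || break
    done
    lower=$((len / 10))   # assumed acceptable (accepted above, or the start value)
    upper=$len            # known to be rejected
    # Step 2: bisect until lower and upper are adjacent.
    while [ $((upper - lower)) -gt 1 ]; do
        mid=$(( (lower + upper) / 2 ))
        if is_acceptable_len "$mid"; then
            lower=$mid
        else
            upper=$mid
        fi
    done
    echo "$lower"
}

find_max_len
```

With the pretend limit above the sketch prints 131071. It needs roughly five probes to find the upper bound and about twenty more for the bisection, which matters when every probe means trying to execute a command.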
sub binary_find_max_length { sub binary_find_max_length {
# Given a lower and upper bound find the max_length of a command line # Given a lower and upper bound find the max_length of a command line
my ($lower, $upper) = (@_); my ($lower, $upper) = (@_);
@ -1465,6 +1580,7 @@ sub init_run_jobs {
# Remember the original STDOUT and STDERR # Remember the original STDOUT and STDERR
open $Global::original_stdout, ">&STDOUT" or die "Can't dup STDOUT: $!"; open $Global::original_stdout, ">&STDOUT" or die "Can't dup STDOUT: $!";
open $Global::original_stderr, ">&STDERR" or die "Can't dup STDERR: $!"; open $Global::original_stderr, ">&STDERR" or die "Can't dup STDERR: $!";
open $Global::original_stdin, "<&STDIN" or die "Can't dup STDIN: $!";
$Global::running_jobs=0; $Global::running_jobs=0;
$SIG{USR1} = \&ListRunningJobs; $SIG{USR1} = \&ListRunningJobs;
$Global::original_sigterm = $SIG{TERM}; $Global::original_sigterm = $SIG{TERM};
@ -1610,13 +1726,22 @@ sub start_job {
$Global::running_jobs++; $Global::running_jobs++;
debug("$Global::running_jobs processes. Starting: $command\n"); debug("$Global::running_jobs processes. Starting: $command\n");
#print STDERR "LEN".length($command)."\n"; #print STDERR "LEN".length($command)."\n";
$Global::job_start_sequence++;
if($::opt_a and $Global::job_start_sequence == 1) {
# Give STDIN to the first job if using -a
$pid = open3("<&STDIN", ">&STDOUT", ">&STDERR", $command) ||
die("open3 failed. Report a bug to <par\@tange.dk>\n");
# Re-open to avoid complaining
open STDIN, "<&", $Global::original_stdin or die "Can't dup \$Global::original_stdin: $!";
} else {
$pid = open3(gensym, ">&STDOUT", ">&STDERR", $command) || $pid = open3(gensym, ">&STDOUT", ">&STDERR", $command) ||
die("open3 failed. Report a bug to <par\@tange.dk>\n"); die("open3 failed. Report a bug to <par\@tange.dk>\n");
}
debug("started: $command\n"); debug("started: $command\n");
open STDOUT, ">&", $Global::original_stdout or die "Can't dup \$oldout: $!"; open STDOUT, ">&", $Global::original_stdout or die "Can't dup \$Global::original_stdout: $!";
open STDERR, ">&", $Global::original_stderr or die "Can't dup \$oldout: $!"; open STDERR, ">&", $Global::original_stderr or die "Can't dup \$Global::original_stderr: $!";
$Global::job_start_sequence++;
if($Global::grouped) { if($Global::grouped) {
return ("seq" => $Global::job_start_sequence, return ("seq" => $Global::job_start_sequence,
"pid" => $pid, "pid" => $pid,
@ -1767,6 +1892,15 @@ sub version {
); );
} }
sub show_limits {
print("Maximal size of command: ",real_max_length(),"\n",
"Maximal used size of command: ",max_length_of_command_line(),"\n",
"\n",
"Execution of will continue now, and it will try to read its input\n",
"and run commands; if this is not what you wanted to happen, please\n",
"press CTRL-D or CTRL-C\n");
}
# #
# Debugging # Debugging

@ -18,6 +18,12 @@
8 8
9 9
10 10
3
1
2
1
3
2
replace replace
replace replace
replace replace
@ -101,3 +107,21 @@ echo far
echo bar echo bar
echo car echo car
echo far echo far
Maximal size of command: 131071
Maximal used size of command: 131071
Execution of will continue now, and it will try to read its input
and run commands; if this is not what you wanted to happen, please
press CTRL-D or CTRL-C
bar
car
far
Maximal size of command: 131071
Maximal used size of command: 100
Execution of will continue now, and it will try to read its input
and run commands; if this is not what you wanted to happen, please
press CTRL-D or CTRL-C
bar
car
far

@ -9,6 +9,17 @@ seq 1 10 >/tmp/$$
$PAR -a /tmp/$$ echo $PAR -a /tmp/$$ echo
$PAR --arg-file /tmp/$$ echo $PAR --arg-file /tmp/$$ echo
cd input-files/test15
# echo 3 | xargs -P 2 -n 1 -a files cat -
echo 3 | parallel -k -P 2 -n 1 -a files cat -
# echo 3 | xargs -I {} -P 2 -n 1 -a files cat {} -
# Should give:
# 3
# 1
# 2
echo 3 | parallel -k -I {} -P 2 -n 1 -a files cat {} -
# Test -i and --replace: Replace with argument # Test -i and --replace: Replace with argument
(echo a; echo END; echo b) | $PAR -k -i -eEND echo repl{}ce (echo a; echo END; echo b) | $PAR -k -i -eEND echo repl{}ce
(echo a; echo END; echo b) | $PAR -k --replace -eEND echo repl{}ce (echo a; echo END; echo b) | $PAR -k --replace -eEND echo repl{}ce
@ -70,3 +81,7 @@ $PAR --version | wc -l
# Test --verbose and -t # Test --verbose and -t
(echo b; echo c; echo f) | $PAR -k -t echo {}ar 2>&1 >/dev/null (echo b; echo c; echo f) | $PAR -k -t echo {}ar 2>&1 >/dev/null
(echo b; echo c; echo f) | $PAR -k --verbose echo {}ar 2>&1 >/dev/null (echo b; echo c; echo f) | $PAR -k --verbose echo {}ar 2>&1 >/dev/null
# Test --show-limits
(echo b; echo c; echo f) | $PAR -k --show-limits echo {}ar
(echo b; echo c; echo f) | $PAR -k --show-limits -s 100 echo {}ar

@ -18,6 +18,12 @@
8 8
9 9
10 10
3
1
2
1
3
2
replace replace
replace replace
replace replace
@ -101,3 +107,21 @@ echo far
echo bar echo bar
echo car echo car
echo far echo far
Maximal size of command: 131071
Maximal used size of command: 131071
Execution of will continue now, and it will try to read its input
and run commands; if this is not what you wanted to happen, please
press CTRL-D or CTRL-C
bar
car
far
Maximal size of command: 131071
Maximal used size of command: 100
Execution of will continue now, and it will try to read its input
and run commands; if this is not what you wanted to happen, please
press CTRL-D or CTRL-C
bar
car
far