Wrote missing man for xargs compatability.

Fixed bug in --arg-file. Implemented --show-limits.
2024-11-22 14:07:55 +00:00 · 2010-04-21 21:28:00 +02:00 · 2010-04-21 21:28:00 +02:00 · 495d8bc0bd
parent 3e50ba19cf
commit 495d8bc0bd
4 changed files with 271 additions and 74 deletions
--- a/src/parallel
+++ b/src/parallel
@ -57,6 +57,35 @@ Use NUL as delimiter.  Normally input lines will end in \n
 for processing filenames that may contain \n (newline).
 =item B<--arg-file>=I<file>
 =item B<-a> I<file>
 Read items from file instead of standard input.  If you use this
 option, stdin is given to the first process run.  Otherwise, stdin is
 redirected from /dev/null.
 =item B<--cleanup> (not implemented)
 Remove transfered files. B<--cleanup> will remove the transfered files
 on the remote server after processing is done.
  find log -name '*gz' | parallel \
    --sshlogin server.example.com --transfer --return {.}.bz2 \
    --cleanup "zcat {} | bzip -9 >{.}.bz2"
 With B<--transfer> the file transfered to the remote server will be
 removed on the remote server.  Directories created will not be removed
 - even if they are empty.
 With B<--return> the file transfered from the remote server will be
 removed on the remote server.  Directories created will not be removed
 - even if they are empty.
 B<--cleanup> is ignored when not used with B<--transfer> or B<--return>.
 =item B<--command>
 =item B<-c>
@ -81,6 +110,24 @@ as \n, or an octal or hexadecimal escape code.  Octal and
 hexadecimal escape codes are understood as for the printf command.
 Multibyte characters are not supported.
 =item B<-E> I<eof-str>
 Set the end of file string to eof-str.  If the end of file string
 occurs as a line of input, the rest of the input is ignored.  If
 neither B<-E> nor B<-e> is used, no end of file string is used.
 =item B<--eof>[=I<eof-str>]
 =item B<-e>[I<eof-str>]
 This option is a synonym for the B<-E> option.  Use B<-E> instead,
 because it is POSIX compliant for B<xargs> while this option is not.
 If I<eof-str> is omitted, there is no end of file string.  If neither
 B<-E> nor B<-e> is used, no end of file string is used.
 =item B<--file>
@ -99,17 +146,25 @@ Group output.  Output from each jobs is grouped together and is only
 printed when the command is finished. STDERR first followed by STDOUT.
 B<-g> is the default. Can be reversed with B<-u>.
 =item B<--help>
 =item B<-h>
 Print a summary of the options to B<parallel> and exit.
 =item B<-I> I<string>
 Use the replacement string I<string> instead of {}.
-=item B<-U> I<string>
+=item B<--replace>[=I<replace-str>]
-=item B<--extensionreplace> I<string>
+=item B<-i>[I<replace-str>]
-Use the replacement string I<string> instead of {.} for input line without extension.
+This option is a synonym for B<-I>I<replace-str> if I<replace-str> is
 specified, and for B<-I>{} otherwise.  This option is deprecated;
 use B<-I> instead.
 =item B<--jobs> I<N>
@ -170,6 +225,16 @@ If the evaluated number is less than 1 then 1 will be used.  See also
 Keep sequence of output same as the order of input. If jobs 1 2 3 4
 end in the sequence 3 1 4 2 the output will still be 1 2 3 4.
 =item B<--max-args>=I<max-args>
 =item B<-n> I<max-args>
 Use at most I<max-args> arguments per command line.  Fewer than
 I<max-args> arguments will be used if the size (see the B<-s> option)
 is exceeded, unless the B<-x> option is given, in which case
 B<parallel> will exit.
 Only used with B<-m> and B<-X>.
 =item B<--number-of-cpus>
@ -193,6 +258,75 @@ QUOTING. Most people will never need this.  Quoting is disabled by
 default.
 =item B<--interactive>
 =item B<-p>
 Prompt the user about whether to run each command line and read a line
 from the terminal.  Only run the command line if the response starts
 with 'y' or 'Y'.  Implies B<-t>.
 =item B<--no-run-if-empty>
 =item B<-r>
 If the standard input does not contain any nonblanks, do not run the
 command.
 =item B<--return> I<filename> (not implemented)
 Transfer files from remote servers. B<--return> is used with
 B<--sshlogin> when the arguments are files on the remote servers. When
 processing is done the file I<filename> will be transfered
 from the remote server using B<rsync> and will be put relative to
 the default login dir. E.g.
  echo foo/bar.txt | parallel \
    --sshlogin server.example.com --return {}.out touch {}.out
 This will transfer the file I<$HOME/foo/bar.txt.out> from the server
 I<server.example.com> to the file I<foo/bar.txt.out> after running
 B<touch foo/bar.txt.out> on I<server.example.com>.
  echo /tmp/foo/bar.txt | parallel \
    --sshlogin server.example.com --return {}.out touch {}.out
 This will transfer the file I</tmp/foo/bar.txt.out> from the server
 I<server.example.com> to the file I</tmp/foo/bar.txt.out> after running
 B<touch /tmp/foo/bar.txt.out> on I<server.example.com>.
 Multiple files can be transfered by repeating the options multiple
 times:
  echo /tmp/foo/bar.txt | \
    parallel --sshlogin server.example.com \
    --return {}.out --return {}.out2 touch {}.out {}.out2
 B<--return> is often used with B<--transfer> and B<--cleanup>.
 B<--return> is ignored when used with B<--sshlogin :> or when not used with B<--sshlogin>.
 =item B<--max-chars>=I<max-chars>
 =item B<-s> I<max-chars>
 Use at most max-chars characters per command line, including the
 command and initial-arguments and the terminating nulls at the ends of
 the argument strings.  The largest allowed value is system-dependent,
 and is calculated as the argument length limit for exec, less the size
 of your environment.  The default value is the maximum.
 =item B<--show-limits>
 Display the limits on the command-line length which are imposed by the
 operating system and the -s option.  Pipe the input from /dev/null
 (and perhaps specify --no-run-if-empty) if you don't want B<parallel>
 to do anything.
 =item B<-S> I<[ncpu/]sshlogin[,[ncpu/]sshlogin]> (not implemented)
 =item B<--sshlogin> I<[ncpu/]sshlogin[,[ncpu/]sshlogin]> (not implemented)
@ -240,6 +374,16 @@ Silent.  The job to be run will not be printed. This is the default.
 Can be reversed with B<-v>.
 =item B<--verbose>
 =item B<-t>
 Print the command line on the standard error output before executing
 it.
 See also B<-v>.
 =item B<--transfer> (not implemented)
 Transfer files to remote servers. B<--transfer> is used with
@ -273,60 +417,6 @@ Transfer, Return, Cleanup. Short hand for:
  --transfer --return I<filename> --cleanup
 =item B<--return> I<filename> (not implemented)
 Transfer files from remote servers. B<--return> is used with
 B<--sshlogin> when the arguments are files on the remote servers. When
 processing is done the file I<filename> will be transfered
 from the remote server using B<rsync> and will be put relative to
 the default login dir. E.g.
  echo foo/bar.txt | parallel \
    --sshlogin server.example.com --return {}.out touch {}.out
 This will transfer the file I<$HOME/foo/bar.txt.out> from the server
 I<server.example.com> to the file I<foo/bar.txt.out> after running
 B<touch foo/bar.txt.out> on I<server.example.com>.
  echo /tmp/foo/bar.txt | parallel \
    --sshlogin server.example.com --return {}.out touch {}.out
 This will transfer the file I</tmp/foo/bar.txt.out> from the server
 I<server.example.com> to the file I</tmp/foo/bar.txt.out> after running
 B<touch /tmp/foo/bar.txt.out> on I<server.example.com>.
 Multiple files can be transfered by repeating the options multiple
 times:
  echo /tmp/foo/bar.txt | \
    parallel --sshlogin server.example.com \
    --return {}.out --return {}.out2 touch {}.out {}.out2
 B<--return> is often used with B<--transfer> and B<--cleanup>.
 B<--return> is ignored when used with B<--sshlogin :> or when not used with B<--sshlogin>.
 =item B<--cleanup> (not implemented)
 Remove transfered files. B<--cleanup> will remove the transfered files
 on the remote server after processing is done.
  find log -name '*gz' | parallel \
    --sshlogin server.example.com --transfer --return {.}.bz2 \
    --cleanup "zcat {} | bzip -9 >{.}.bz2"
 With B<--transfer> the file transfered to the remote server will be
 removed on the remote server.  Directories created will not be removed
 - even if they are empty.
 With B<--return> the file transfered from the remote server will be
 removed on the remote server.  Directories created will not be removed
 - even if they are empty.
 B<--cleanup> is ignored when not used with B<--transfer> or B<--return>.
 =item B<--ungroup>
 =item B<-u>
@ -335,6 +425,13 @@ Ungroup output.  Output is printed as soon as possible. This may cause
 output from different commands to be mixed. Can be reversed with B<-g>.
 =item B<-U> I<string>
 =item B<--extensionreplace> I<string>
 Use the replacement string I<string> instead of {.} for input line without extension.
 =item B<--use-cpus-instead-of-cores> (not implemented)
 Count the number of CPUs instead of cores. When computing how many
@ -348,7 +445,14 @@ Normal users will not need this option.
 =item B<-v>
 Verbose.  Print the job to be run on STDOUT. Can be reversed with
-B<--silent>.
+B<--silent>. See also B<-t>.
 =item B<--version>
 =item B<-V>
 Print the version B<parallel> and exit.
 =item B<--xargs>
@ -800,6 +904,10 @@ Copyright (C) 2008,2009,2010 Ole Tange, http://ole.tange.dk
 Copyright (C) 2010 Ole Tange, http://ole.tange.dk and Free Software
 Foundation, Inc.
 Parts of the manual concerning B<xargs> compatability is inspired by
 the manual of B<xargs> from GNU findutils 4.4.2.
 =head1 LICENSE
@ -949,7 +1057,6 @@ GetOptions("debug|D" => \$::opt_D,
 	   # xargs-compatibility - implemented, man, unittest
 	   "max-procs|P=s" => \$::opt_P,
 	   "delimiter|d=s" => \$::opt_d,
 	   # xargs-compatibility - implemented, unittest - man missing
 	   "max-chars|s=i" => \$::opt_s,
 	   "arg-file|a=s" => \$::opt_a,
 	   "no-run-if-empty|r" => \$::opt_r,
@ -957,19 +1064,21 @@ GetOptions("debug|D" => \$::opt_D,
 	   "E=s" => \$::opt_E,
 	   "eof|e:s" => \$::opt_E,
 	   "max-args|n=i" => \$::opt_n,
 	   "verbose|t" => \$::opt_verbose,
 	   "help|h" => \$::opt_help,
 	   "verbose|t" => \$::opt_verbose,
 	   "version|V" => \$::opt_version,
-	   ## xargs-compatibility - implemented - unittest missing - man missing
+	   "show-limits" => \$::opt_show_limits,
 	   ## xargs-compatibility - implemented, man - unittest missing
 	   "interactive|p" => \$::opt_p,
 	   ## How to unittest? tty skal emuleres
 	   # xargs-compatibility - implemented, unittest - man missing
 	   #none
 	   # xargs-compatability - unimplemented
 	   "L=i" => \$::opt_L,
 	   "max-lines|l:i" => \$::opt_l,
 	   ## (echo a b;echo c) | xargs -l1 echo
 	   ## (echo a b' ';echo c) | xargs -l1 echo
 	   "show-limits" => \$::opt_show_limits,
 	   "exit|x" => \$::opt_x,
    ) || die_usage();
@ -1017,6 +1126,7 @@ if(defined $::opt_help) { die_usage(); }
 if(defined $::opt_number_of_cpus) { print no_of_cpus(),"\n"; exit(0); }
 if(defined $::opt_number_of_cores) { print no_of_cores(),"\n"; exit(0); }
 if(defined $::opt_version) { version(); exit(0); }
 if(defined $::opt_show_limits) { show_limits(); }
 if(defined $::opt_a) {
    if(not open(ARGFILE,"<".$::opt_a)) {
@ -1208,12 +1318,7 @@ sub max_length_of_command_line {
    # Find the max_length of a command line
    # First find an upper bound
    if(not $Global::command_line_max_len) {
-	my $len = 10;
+	$Global::command_line_max_len = real_max_length();
 	do {
 	    $len *= 10;
 	} while (is_acceptable_command_line_length($len));
 	# Then search for the actual max length between 0 and upper bound
 	$Global::command_line_max_len = binary_find_max_length(int(($len)/10),$len);
 	if($::opt_s) {
 	    if($::opt_s <= $Global::command_line_max_len) {
 		$Global::command_line_max_len = $::opt_s;
@ -1226,6 +1331,16 @@ sub max_length_of_command_line {
    return $Global::command_line_max_len;
 }
 sub real_max_length {
    my $len = 10;
    do {
 	$len *= 10;
    } while (is_acceptable_command_line_length($len));
    # Then search for the actual max length between 0 and upper bound
    return binary_find_max_length(int(($len)/10),$len);
 }
 sub binary_find_max_length {
    # Given a lower and upper bound find the max_length of a command line
    my ($lower, $upper) = (@_);
@ -1465,6 +1580,7 @@ sub init_run_jobs {
    # Remember the original STDOUT and STDERR
    open $Global::original_stdout, ">&STDOUT" or die "Can't dup STDOUT: $!";
    open $Global::original_stderr, ">&STDERR" or die "Can't dup STDERR: $!";
    open $Global::original_stdin, "<&STDIN" or die "Can't dup STDIN: $!";
    $Global::running_jobs=0;
    $SIG{USR1} = \&ListRunningJobs;
    $Global::original_sigterm = $SIG{TERM};
@ -1610,13 +1726,22 @@ sub start_job {
    $Global::running_jobs++;
    debug("$Global::running_jobs processes. Starting: $command\n");
    #print STDERR "LEN".length($command)."\n";
    $pid = open3(gensym, ">&STDOUT", ">&STDERR", $command) || 
 	die("open3 failed. Report a bug to <par\@tange.dk>\n");
    debug("started: $command\n");
    open STDOUT, ">&", $Global::original_stdout or die "Can't dup \$oldout: $!";
    open STDERR, ">&", $Global::original_stderr or die "Can't dup \$oldout: $!";
    $Global::job_start_sequence++;
    if($::opt_a and $Global::job_start_sequence == 1) {
 	# Give STDIN to the first job if using -a
 	$pid = open3("<&STDIN", ">&STDOUT", ">&STDERR", $command) || 
 	    die("open3 failed. Report a bug to <par\@tange.dk>\n");
 	# Re-open to avoid complaining
 	open STDIN, "<&", $Global::original_stdin or die "Can't dup \$Global::original_stdin: $!";
    } else {
 	$pid = open3(gensym, ">&STDOUT", ">&STDERR", $command) || 
 	    die("open3 failed. Report a bug to <par\@tange.dk>\n");
    }
    debug("started: $command\n");
    open STDOUT, ">&", $Global::original_stdout or die "Can't dup \$Global::original_stdout: $!";
    open STDERR, ">&", $Global::original_stderr or die "Can't dup \$Global::original_stderr: $!";
    if($Global::grouped) {
 	return ("seq" => $Global::job_start_sequence,
 		"pid" => $pid,
@ -1767,6 +1892,15 @@ sub version {
 	);
 }
 sub show_limits {
    print("Maximal size of command: ",real_max_length(),"\n",
 	  "Maximal used size of command: ",max_length_of_command_line(),"\n",
 	  "\n",
 	  "Execution of  will continue now, and it will try to read its input\n",
 	  "and run commands; if this is not what you wanted to happen, please\n",
 	  "press CTRL-D or CTRL-C\n");
 }
 #
 # Debugging
--- a/unittest/actual-results/test15
+++ b/unittest/actual-results/test15
@ -18,6 +18,12 @@
 8
 9
 10
 3
 1
 2
 1
 3
 2
 replace
 replace
 replace
@ -101,3 +107,21 @@ echo far
 echo bar
 echo car
 echo far
 Maximal size of command: 131071
 Maximal used size of command: 131071
 Execution of  will continue now, and it will try to read its input
 and run commands; if this is not what you wanted to happen, please
 press CTRL-D or CTRL-C
 bar
 car
 far
 Maximal size of command: 131071
 Maximal used size of command: 100
 Execution of  will continue now, and it will try to read its input
 and run commands; if this is not what you wanted to happen, please
 press CTRL-D or CTRL-C
 bar
 car
 far
--- a/unittest/tests-to-run/test15.sh
+++ b/unittest/tests-to-run/test15.sh
@ -9,6 +9,17 @@ seq 1 10 >/tmp/$$
 $PAR -a /tmp/$$ echo
 $PAR --arg-file /tmp/$$ echo
 cd input-files/test15
 # echo 3 | xargs -P 2 -n 1 -a files cat -
 echo 3 | parallel -k -P 2 -n 1 -a files cat -
 # echo 3 | xargs -I {} -P 2 -n 1 -a files cat {} -
 # Should give:
 # 3
 # 1
 # 2
 echo 3 | parallel -k -I {} -P 2 -n 1 -a files cat {} -
 # Test -i and --replace: Replace with argument
 (echo a; echo END; echo b) | $PAR -k -i -eEND echo repl{}ce
 (echo a; echo END; echo b) | $PAR -k --replace -eEND echo repl{}ce
@ -70,3 +81,7 @@ $PAR --version | wc -l
 # Test --verbose and -t
 (echo b; echo c; echo f) | $PAR -k -t echo {}ar 2>&1 >/dev/null
 (echo b; echo c; echo f) | $PAR -k --verbose echo {}ar 2>&1 >/dev/null
 # Test --show-limits
 (echo b; echo c; echo f) | $PAR -k --show-limits echo {}ar
 (echo b; echo c; echo f) | $PAR -k --show-limits -s 100 echo {}ar
--- a/unittest/wanted-results/test15
+++ b/unittest/wanted-results/test15
@ -18,6 +18,12 @@
 8
 9
 10
 3
 1
 2
 1
 3
 2
 replace
 replace
 replace
@ -101,3 +107,21 @@ echo far
 echo bar
 echo car
 echo far
 Maximal size of command: 131071
 Maximal used size of command: 131071
 Execution of  will continue now, and it will try to read its input
 and run commands; if this is not what you wanted to happen, please
 press CTRL-D or CTRL-C
 bar
 car
 far
 Maximal size of command: 131071
 Maximal used size of command: 100
 Execution of  will continue now, and it will try to read its input
 and run commands; if this is not what you wanted to happen, please
 press CTRL-D or CTRL-C
 bar
 car
 far