Re-fixed #56115: Compute max-line-length fast on CygWin.

Fixed bug #56214: *Q is not set in shell_quote_scalar.
This commit is contained in:
Ole Tange 2019-04-28 22:13:10 +02:00
parent d6a3848b68
commit 81741d3dbc
3 changed files with 56 additions and 22 deletions

View file

@ -1,5 +1,8 @@
Quote of the month: Quote of the month:
Amazingly useful script!
-- unxusr@reddit.com
GNU parallel really changed how I do a lot of data processing stuff GNU parallel really changed how I do a lot of data processing stuff
-- Brendan Dolan-Gavitt @moyix@twitter -- Brendan Dolan-Gavitt @moyix@twitter

View file

@ -2524,14 +2524,15 @@ sub shell_quote_scalar($) {
*shell_quote_scalar = \&shell_quote_scalar_default; *shell_quote_scalar = \&shell_quote_scalar_default;
} }
# The sub is now redefined. Call it # The sub is now redefined. Call it
return shell_quote_scalar(@_); return shell_quote_scalar($_[0]);
} }
sub Q($) { sub Q($) {
# Q alias for ::shell_quote_scalar # Q alias for ::shell_quote_scalar
my $ret = shell_quote_scalar($_[0]);
no warnings 'redefine'; no warnings 'redefine';
*Q = \&::shell_quote_scalar; *Q = \&::shell_quote_scalar;
return Q(@_); return $ret;
} }
sub shell_quote_file($) { sub shell_quote_file($) {
@ -2578,8 +2579,9 @@ sub perl_quote_scalar($) {
# -w complains about prototype # -w complains about prototype
sub pQ($) { sub pQ($) {
# pQ alias for ::perl_quote_scalar # pQ alias for ::perl_quote_scalar
my $ret = perl_quote_scalar($_[0]);
*pQ = \&::perl_quote_scalar; *pQ = \&::perl_quote_scalar;
return pQ(@_); return $ret;
} }
sub unquote_printf() { sub unquote_printf() {
@ -10899,8 +10901,9 @@ sub real_max_length($) {
# The maximal command line length # The maximal command line length
# Use an upper bound of 100 MB if the shell allows for infinite long lengths # Use an upper bound of 100 MB if the shell allows for infinite long lengths
my $upper = 100_000_000; my $upper = 100_000_000;
# 268 makes the search faster on CygWin - 1000 is supported everywhere # 1000 is supported everywhere, so the search can start anywhere 1..999
my $len = 268; # 324 makes the search much faster on CygWin, so let us use that
my $len = 324;
do { do {
if($len > $upper) { return $len }; if($len > $upper) { return $len };
$len *= 16; $len *= 16;
@ -11483,15 +11486,18 @@ sub new($) {
sub Q($) { sub Q($) {
# Q alias for ::shell_quote_scalar # Q alias for ::shell_quote_scalar
my $ret = ::Q($_[0]);
no warnings 'redefine'; no warnings 'redefine';
*Q = \&::shell_quote_scalar; *Q = \&::Q;
return Q(@_); return $ret;
} }
sub pQ($) { sub pQ($) {
# pQ alias for ::perl_quote_scalar # pQ alias for ::perl_quote_scalar
*pQ = \&::perl_quote_scalar; my $ret = ::pQ($_[0]);
return pQ(@_); no warnings 'redefine';
*pQ = \&::pQ;
return $ret;
} }
sub total_jobs() { sub total_jobs() {

View file

@ -62,9 +62,13 @@ or download it at: https://doi.org/10.5281/zenodo.1146014
Otherwise start by watching the intro videos for a quick introduction: Otherwise start by watching the intro videos for a quick introduction:
http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1 http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Then browse through the B<EXAMPLE>s after the list of B<OPTIONS> in If you need a one page printable cheat sheet you can find it on:
B<man parallel> (Use B<LESS=+/EXAMPLE: man parallel>). That will give https://www.gnu.org/software/parallel/parallel_cheat.pdf
you an idea of what GNU B<parallel> is capable of.
You can find a lot of B<EXAMPLE>s of use after the list of B<OPTIONS>
in B<man parallel> (Use B<LESS=+/EXAMPLE: man parallel>). That will
give you an idea of what GNU B<parallel> is capable of, and you may
find a solution you can simply adapt to your situation.
If you want to dive even deeper: spend a couple of hours walking If you want to dive even deeper: spend a couple of hours walking
through the tutorial (B<man parallel_tutorial>). Your command line through the tutorial (B<man parallel_tutorial>). Your command line
@ -663,11 +667,11 @@ equivalent: B<--delay 100000> and B<--delay 1d3.5h16.6m4s>.
=item B<--dry-run> =item B<--dry-run>
Print the job to run on stdout (standard output), but do not run the Print the job to run on stdout (standard output), but do not run the
job. Use B<-v -v> to include the wrapping that GNU Parallel generates job. Use B<-v -v> to include the wrapping that GNU B<parallel>
(for remote jobs, B<--tmux>, B<--nice>, B<--pipe>, B<--pipepart>, generates (for remote jobs, B<--tmux>, B<--nice>, B<--pipe>,
B<--fifo> and B<--cat>). Do not count on this literally, though, as the B<--pipepart>, B<--fifo> and B<--cat>). Do not count on this
job may be scheduled on another computer or the local computer if : is literally, though, as the job may be scheduled on another computer or
in the list. the local computer if : is in the list.
=item B<--eof>[=I<eof-str>] =item B<--eof>[=I<eof-str>]
@ -3391,6 +3395,27 @@ If B<-j0> normally spawns 252 jobs, then the above will try to spawn
this technique with no problems. To raise the 32000 jobs limit raise this technique with no problems. To raise the 32000 jobs limit raise
/proc/sys/kernel/pid_max to 4194303. /proc/sys/kernel/pid_max to 4194303.
If you do not need GNU B<parallel> to have control over each job (so
no need for B<--retries> or B<--joblog> or similar), then it can be
even faster if you can generate the command lines and pipe those to a
shell. So if you can do this:
mygenerator | sh
Then that can be parallelized like this:
mygenerator | parallel --pipe --block 10M sh
E.g.
mygenerator() {
seq 10000000 | perl -pe 'print "echo This is fast job number "';
}
mygenerator | parallel --pipe --block 10M sh
The overhead is 100000 times smaller namely around 100 nanoseconds per
job.
=head1 EXAMPLE: Using shell variables =head1 EXAMPLE: Using shell variables
@ -4308,7 +4333,7 @@ In some cases you can run on more CPUs and computers during the night:
cp night_server_list ~/.parallel/sshloginfile cp night_server_list ~/.parallel/sshloginfile
tail -n+0 -f jobqueue | parallel --jobs jobfile -S .. tail -n+0 -f jobqueue | parallel --jobs jobfile -S ..
GNU Parallel discovers if B<jobfile> or B<~/.parallel/sshloginfile> GNU B<parallel> discovers if B<jobfile> or B<~/.parallel/sshloginfile>
changes. changes.
There is a a small issue when using GNU B<parallel> as queue There is a a small issue when using GNU B<parallel> as queue
@ -4346,8 +4371,8 @@ If the files to be processed are in a tar file then unpacking one file
and processing it immediately may be faster than first unpacking all and processing it immediately may be faster than first unpacking all
files. Set up the dir processor as above and unpack into the dir. files. Set up the dir processor as above and unpack into the dir.
Using GNU Parallel as dir processor has the same limitations as using Using GNU B<parallel> as dir processor has the same limitations as
GNU Parallel as queue system/batch manager. using GNU B<parallel> as queue system/batch manager.
=head1 EXAMPLE: Locate the missing package =head1 EXAMPLE: Locate the missing package
@ -4551,7 +4576,7 @@ Options to pass on to B<rsync>. Defaults to: -rlDzR.
=item $PARALLEL_SHELL =item $PARALLEL_SHELL
Use this shell for the commands run by GNU Parallel: Use this shell for the commands run by GNU B<parallel>:
=over 2 =over 2
@ -4561,7 +4586,7 @@ $PARALLEL_SHELL. If undefined use:
=item * =item *
The shell that started GNU Parallel. If that cannot be determined: The shell that started GNU B<parallel>. If that cannot be determined:
=item * =item *