Fixed bug #53864: env_parallel + parset as part of parallel --embed.

Ole Tange 2018-05-08 23:16:48 +02:00
parent 60e45b20eb
commit a050662275
8 changed files with 119 additions and 70 deletions


@ -1,24 +1,21 @@
When using programs that use GNU Parallel to process data for publication please cite:
@article{Tange2011a,
title = {GNU Parallel - The Command-Line Power Tool},
author = {O. Tange},
address = {Frederiksberg, Denmark},
journal = {;login: The USENIX Magazine},
month = {Feb},
number = {1},
volume = {36},
url = {http://www.gnu.org/s/parallel},
year = {2011},
pages = {42-47}
doi = {10.5281/zenodo.16303}
@book{tange_ole_2018_1146014,
author = {Tange, Ole},
title = {GNU Parallel 2018},
publisher = {Ole Tange},
month = Mar,
year = 2018,
ISBN = {9781387509881},
doi = {10.5281/zenodo.1146014},
url = {https://doi.org/10.5281/zenodo.1146014}
}
(Feel free to use \nocite{Tange2011a})
(Feel free to use \nocite{tange_ole_2018_1146014})
Or:
O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
;login: The USENIX Magazine, February 2011:42-47.
O. Tange (2018): GNU Parallel 2018, Mar 2018, ISBN 9781387509881,
DOI https://doi.org/10.5281/zenodo.1146014
This helps funding further development.

README

@ -93,8 +93,8 @@ will love you for it.
When using programs that use GNU Parallel to process data for
publication please cite:
O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
;login: The USENIX Magazine, February 2011:42-47.
O. Tange (2018): GNU Parallel 2018, Mar 2018, ISBN 9781387509881,
DOI https://doi.org/10.5281/zenodo.1146014
= New versions =


@ -199,7 +199,7 @@ to:parallel@gnu.org, bug-parallel@gnu.org
stable-bcc: Jesse Alama <jessealama@fastmail.fm>
Subject: GNU Parallel 20180522 ('') released <<[stable]>>
Subject: GNU Parallel 20180522 ('Korea') released <<[stable]>>
GNU Parallel 20180522 ('') <<[stable]>> has been released. It is available for download at: http://ftpmirror.gnu.org/parallel/


@ -4331,6 +4331,17 @@ sub embed {
# Read the source from $0
my @source = <$fh>;
my $user = $ENV{LOGNAME} || $ENV{USERNAME} || $ENV{USER};
my @env_parallel_source = ();
my $shell = $Global::shell;
$shell =~ s:.*/::;
for(which("env_parallel.$shell")) {
-r $_ or next;
# Read the source of env_parallel.shellname
open(my $env_parallel_source_fh, $_) || die;
@env_parallel_source = <$env_parallel_source_fh>;
close $env_parallel_source_fh;
last;
}
print "#!$Global::shell
# Copyright (C) 2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,
@ -4390,9 +4401,16 @@ parallel() {
# Read the source from the fifo
perl $_fifo_with_parallel_source "$@"
}
!,
@env_parallel_source,
q!
# This will call the function above
# This will call the functions above
parallel -k echo ::: Put your code here
env_parallel --session
env_parallel -k echo ::: Put your code here
parset p,y,c,h -k echo ::: Put your code here
echo $p $y $c $h
!;
} else {
::error("Cannot open $0");


@ -162,26 +162,30 @@ the last half of the line is from another process. The example
B<Parallel grep> cannot be done reliably with B<xargs> because of
this. To see this in action try:
parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
::: a b c d e f
ls -l a b c d e f
parallel -kP4 -n1 grep 1 > out.par ::: a b c d e f
echo a b c d e f | xargs -P4 -n1 grep 1 > out.xargs-unbuf
echo a b c d e f | \
xargs -P4 -n1 grep --line-buffered 1 > out.xargs-linebuf
echo a b c d e f | xargs -n1 grep 1 > out.xargs-serial
ls -l out*
md5sum out*
parallel perl -e '\$a=\"1\".\"{}\"x10000000\;print\ \$a,\"\\n\"' \
'>' {} ::: a b c d e f g h
# Serial = no mixing = the wanted result
# 'tr -s a-z' squeezes repeating letters into a single letter
echo a b c d e f g h | xargs -P1 -n1 grep 1 | tr -s a-z
# Compare to 8 jobs in parallel
parallel -kP8 -n1 grep 1 ::: a b c d e f g h | tr -s a-z
echo a b c d e f g h | xargs -P8 -n1 grep 1 | tr -s a-z
echo a b c d e f g h | xargs -P8 -n1 grep --line-buffered 1 | \
tr -s a-z
Or try this:
slow_seq() {
echo Count to "$@"
seq "$@" |
perl -ne '$|=1; for(split//){ print; select($a,$a,$a,0.100);}'
}
export -f slow_seq
seq 5 | xargs -n1 -P0 -I {} bash -c 'slow_seq {}'
seq 5 | parallel -P0 slow_seq {}
# Serial = no mixing = the wanted result
seq 8 | xargs -n1 -P1 -I {} bash -c 'slow_seq {}'
# Compare to 8 jobs in parallel
seq 8 | parallel -P8 slow_seq {}
seq 8 | xargs -n1 -P8 -I {} bash -c 'slow_seq {}'
B<xargs> has no support for keeping the order of the output, therefore
if running jobs in parallel using B<xargs> the output of the second
@ -201,7 +205,8 @@ composed commands and redirection require using B<bash -c>.
ls | parallel "wc {} >{}.wc"
ls | parallel "echo {}; ls {}|wc"
becomes (assuming you have 8 cores)
becomes (assuming you have 8 cores and that none of the file names
contain space, " or ').
ls | xargs -d "\n" -P8 -I {} bash -c "wc {} >{}.wc"
ls | xargs -d "\n" -P8 -I {} bash -c "echo {}; ls {}|wc"
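The caveat about spaces and quotes arises because B<xargs> pastes {} into a command string that B<bash -c> then parses again. A minimal sketch of one way to sidestep this, assuming GNU xargs (for -d) and bash: hand the file name to the inner shell as a positional parameter. The helper name count_lines_each and the scratch directory are illustrative only and are not part of the GNU Parallel documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the line count of every file in directory $1.
# The file name reaches the inner shell as "$1", so it is never re-parsed
# as shell code, unlike: xargs -I {} bash -c "wc {}"
count_lines_each() {
  (
    cd "$1" || exit 1
    ls | xargs -d "\n" -P1 -I {} bash -c 'wc -l < "$1"' _ {}
  )
}

# Demo on a scratch directory holding one awkwardly named file
demo_dir=$(mktemp -d)
printf 'one\ntwo\n' > "$demo_dir/has space and 'quote'.txt"
count_lines_each "$demo_dir"
rm -rf "$demo_dir"
```

With GNU wc the demo should print the line count of the single file. File names containing a newline are still not handled, because the input is split on newlines.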
@ -791,12 +796,14 @@ https://github.com/rofl0r/jobflow
B<gargs> can run multiple jobs in parallel.
It caches output in memory. This causes it to be extremely slow when
the output is larger than the physical RAM, and can cause the system
to run out of memory.
Older versions cache output in memory. This causes it to be extremely
slow when the output is larger than the physical RAM, and can cause
the system to run out of memory.
See more details on this in B<man parallel_design>.
Newer versions cache output in files, but leave files in $TMPDIR if it
is killed.
Output to stderr (standard error) is changed if the command fails.


@ -72,41 +72,46 @@ par_over_4GB() {
nice md5sum
}
par_mem_leak() {
echo "### test for mem leak"
export parallel=parallel
no_mem_leak() {
run_measurements() {
from=$1
to=$2
pause_every=$3
measure() {
# Input:
# $1 = iterations
# $2 = sleep 1 sec for every $2
seq $1 | ramusage parallel -u sleep '{= $_=$_%'$2'?0:1 =}'
seq $1 | ramusage $parallel -u sleep '{= $_=$_%'$2'?0:1 =}'
}
export -f measure
seq $from $to | $parallel measure {} $pause_every |
sort -n
}
# Return false if leaking
max1000=$(parallel measure {} 100000 ::: 1000 1000 1000 1000 1000 1000 1000 1000 |
sort -n | tail -n 1)
min30000=$(parallel measure {} 100000 ::: 3000 3000 3000 |
sort -n | head -n 1)
# Normal: 16940-17320
max1000=$(run_measurements 1000 1007 100000 | tail -n1)
min30000=$(run_measurements 15000 15004 100000 | head -n1)
if [ $max1000 -gt $min30000 ] ; then
# Make sure there are a few sleeps
max1000=$(parallel measure {} 100 ::: 1000 1000 1000 1000 1000 1000 1000 1000 |
sort -n | tail -n 1)
min30000=$(parallel measure {} 100 ::: 3000 3000 3000 |
sort -n | head -n 1)
if [ $max1000 -gt $min30000 ] ; then
echo $max1000 -gt $min30000 = no leak
echo Probably no leak $max1000 -gt $min30000
return 0
else
echo not $max1000 -gt $min30000 = possible leak
echo Probably leaks $max1000 not -gt $min30000
# Make sure there are a few sleeps
max1000=$(run_measurements 1001 1007 100 | tail -n1)
min30000=$(run_measurements 30000 30004 100 | head -n1)
if [ $max1000 -gt $min30000 ] ; then
echo $max1000 -gt $min30000 = very likely no leak
return 0
else
echo not $max1000 -gt $min30000 = very likely leak
return 1
fi
else
echo not $max1000 -gt $min30000 = possible leak
return 1
fi
}
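The comparison at the heart of this test can be read as: if the largest reading from the short runs is still above the smallest reading from the long runs, the two ranges overlap and memory use probably does not grow with the number of jobs. A minimal sketch of that check in isolation, with made-up readings instead of real ramusage output; the function name probably_no_leak is hypothetical and does not appear in the test suite.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when the largest reading of the short runs
# exceeds the smallest reading of the long runs.
# $1 = whitespace-separated readings from the short runs
# $2 = whitespace-separated readings from the long runs
probably_no_leak() {
  local max_short min_long
  # $1 and $2 are left unquoted on purpose so each reading becomes one line
  max_short=$(printf '%s\n' $1 | sort -n | tail -n1)
  min_long=$(printf '%s\n' $2 | sort -n | head -n1)
  [ "$max_short" -gt "$min_long" ]
}

# Made-up readings, for illustration only
if probably_no_leak "17100 17320 16940" "17000 17250 17300"; then
  echo "Probably no leak"
else
  echo "Possible leak"
fi
```

With these sample readings the check succeeds, because 17320 is greater than 17000.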


@ -13,7 +13,7 @@ myvar=OK
parallel echo ::: parallel_OK
PATH=/usr/sbin:/usr/bin:/sbin:/bin
# Do not look for parallel in /usr/local/bin
. \`which env_parallel.ash\`
#. \`which env_parallel.ash\`
}
' | tac > parallel-embed
chmod +x parallel-embed
@ -37,7 +37,7 @@ myvar=OK
parallel echo ::: parallel_OK
PATH=/usr/sbin:/usr/bin:/sbin:/bin
# Do not look for parallel in /usr/local/bin
. \`which env_parallel.bash\`
#. \`which env_parallel.bash\`
}
' | tac > parallel-embed
chmod +x parallel-embed
@ -69,12 +69,12 @@ myvar=OK
parallel echo ::: parallel_OK
PATH=/usr/sbin:/usr/bin:/sbin:/bin
# Do not look for parallel in /usr/local/bin
. \`which env_parallel.ksh\`
#. \`which env_parallel.ksh\`
}
' | tac > parallel-embed
chmod +x parallel-embed
./parallel-embed
# rm parallel-embed
rm parallel-embed
_EOF
)
ssh ksh@lo "$myscript"
@ -93,12 +93,12 @@ myvar=OK
parallel echo ::: parallel_OK
PATH=/usr/sbin:/usr/bin:/sbin:/bin
# Do not look for parallel in /usr/local/bin
. \`which env_parallel.sh\`
#. \`which env_parallel.sh\`
}
' | tac > parallel-embed
chmod +x parallel-embed
./parallel-embed
# rm parallel-embed
rm parallel-embed
_EOF
)
ssh sh@lo "$myscript"
@ -121,7 +121,6 @@ myvar=OK
parallel echo ::: parallel_OK
PATH=/usr/sbin:/usr/bin:/sbin:/bin
# Do not look for parallel in /usr/local/bin
. \`which env_parallel.zsh\`
}
' | tac > parallel-embed
chmod +x parallel-embed
@ -138,4 +137,5 @@ export -f $(compgen -A function | grep par_)
compgen -A function | grep par_ | sort -r |
# parallel --joblog /tmp/jl-`basename $0` --delay $D -j$P --tag -k '{} 2>&1'
parallel --joblog /tmp/jl-`basename $0` -j200% --tag -k '{} 2>&1' |
perl -pe 's/line \d\d\d:/line XXX:/'
perl -pe 's/line \d\d\d+:/line XXX:/' |
perl -pe 's/\[\d\d\d+\]:/[XXX]:/'


@ -6,10 +6,7 @@ par_zsh_embed your
par_zsh_embed code
par_zsh_embed here
par_zsh_embed parallel_OK
par_zsh_embed /home/zsh/.zshenv:.:3: no such file or directory: env_parallel.zsh
par_zsh_embed env_parallel --env OK
par_zsh_embed /home/zsh/.zshenv:.:3: no such file or directory: env_parallel.zsh
par_zsh_embed /home/zsh/.zshenv:.:3: no such file or directory: env_parallel.zsh
par_zsh_embed _which:12: argument list too long: perl
par_zsh_embed env_parallel: Error: Your environment is too big.
par_zsh_embed env_parallel: Error: You can try 2 different approaches:
@ -19,6 +16,11 @@ par_zsh_embed env_parallel: Error: env_parallel --record-env
par_zsh_embed env_parallel: Error: And then use '--env _'
par_zsh_embed env_parallel: Error: For details see: man env_parallel
par_zsh_embed ParsetOK
par_zsh_embed Put
par_zsh_embed your
par_zsh_embed code
par_zsh_embed here
par_zsh_embed Put your code here
par_tcsh_embed Not implemented
par_sh_embed --embed
par_sh_embed Redirect the output to a file and add your changes at the end:
@ -31,6 +33,11 @@ par_sh_embed parallel_OK
par_sh_embed env_parallel --env OK
par_sh_embed env_parallel_OK
par_sh_embed ParsetOK
par_sh_embed Put
par_sh_embed your
par_sh_embed code
par_sh_embed here
par_sh_embed Put your code here
par_ksh_embed --embed
par_ksh_embed Redirect the output to a file and add your changes at the end:
par_ksh_embed /usr/local/bin/parallel --embed > new_script
@ -40,7 +47,7 @@ par_ksh_embed code
par_ksh_embed here
par_ksh_embed parallel_OK
par_ksh_embed env_parallel --env OK
par_ksh_embed ./parallel-embed[118]: perl: /usr/bin/perl: cannot execute [Argument list too long]
par_ksh_embed ./parallel-embed[XXX]: perl: /usr/bin/perl: cannot execute [Argument list too long]
par_ksh_embed env_parallel: Error: Your environment is too big.
par_ksh_embed env_parallel: Error: You can try 2 different approaches:
par_ksh_embed env_parallel: Error: 1. Use --env and only mention the names to copy.
@ -49,6 +56,11 @@ par_ksh_embed env_parallel: Error: env_parallel --record-env
par_ksh_embed env_parallel: Error: And then use '--env _'
par_ksh_embed env_parallel: Error: For details see: man env_parallel
par_ksh_embed ParsetOK
par_ksh_embed Put
par_ksh_embed your
par_ksh_embed code
par_ksh_embed here
par_ksh_embed Put your code here
par_fish_embed Not implemented
par_csh_embed Not implemented
par_bash_embed --embed
@ -60,7 +72,7 @@ par_bash_embed code
par_bash_embed here
par_bash_embed parallel_OK
par_bash_embed env_parallel --env OK
par_bash_embed /usr/local/bin/env_parallel.bash: line XXX: /usr/bin/perl: Argument list too long
par_bash_embed ./parallel-embed: line XXX: /usr/bin/perl: Argument list too long
par_bash_embed env_parallel: Error: Your environment is too big.
par_bash_embed env_parallel: Error: You can try 2 different approaches:
par_bash_embed env_parallel: Error: 1. Use --env and only mention the names to copy.
@ -69,6 +81,11 @@ par_bash_embed env_parallel: Error: env_parallel --record-env
par_bash_embed env_parallel: Error: And then use '--env _'
par_bash_embed env_parallel: Error: For details see: man env_parallel
par_bash_embed ParsetOK
par_bash_embed Put
par_bash_embed your
par_bash_embed code
par_bash_embed here
par_bash_embed Put your code here
par_ash_embed --embed
par_ash_embed Redirect the output to a file and add your changes at the end:
par_ash_embed /usr/local/bin/parallel --embed > new_script
@ -80,3 +97,8 @@ par_ash_embed parallel_OK
par_ash_embed env_parallel --env OK
par_ash_embed env_parallel_OK
par_ash_embed ParsetOK
par_ash_embed Put
par_ash_embed your
par_ash_embed code
par_ash_embed here
par_ash_embed Put your code here