Fixed bug #53864: env_parallel + parset as part of parallel --embed.

2024-11-21 21:47:54 +00:00 · 2018-05-08 23:16:48 +02:00 · a050662275
parent 60e45b20eb
commit a050662275
8 changed files with 119 additions and 70 deletions
--- a/27
+++ b/27
@@ -1,24 +1,21 @@
 When using programs that use GNU Parallel to process data for publication please cite:

-@article{Tange2011a,
-  title = {GNU Parallel - The Command-Line Power Tool},
-  author = {O. Tange},
-  address = {Frederiksberg, Denmark},
-  journal = {;login: The USENIX Magazine},
-  month = {Feb},
-  number = {1},
-  volume = {36},
-  url = {http://www.gnu.org/s/parallel},
-  year = {2011},
-  pages = {42-47}
-  doi = {10.5281/zenodo.16303}
+@book{tange_ole_2018_1146014,
+      author       = {Tange, Ole},
+      title        = {GNU Parallel 2018},
+      publisher    = {Ole Tange},
+      month        = Mar,
+      year         = 2018,
+      ISBN         = {9781387509881},
+      doi          = {10.5281/zenodo.1146014},
+      url          = {https://doi.org/10.5281/zenodo.1146014}
 }

-(Feel free to use \nocite{Tange2011a})
+(Feel free to use \nocite{tange_ole_2018_1146014})

 Or:

-  O. Tange (2011): GNU Parallel - The Command-Line Power Tool, 
-  ;login: The USENIX Magazine, February 2011:42-47.
+  O. Tange (2018): GNU Parallel 2018, Mar 2018, ISBN 9781387509881,
+  DOI https://doi.org/10.5281/zenodo.11460

 This helps funding further development.
--- a/4
+++ b/4
@@ -93,8 +93,8 @@ will love you for it.
 When using programs that use GNU Parallel to process data for
 publication please cite:

-    O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
-    ;login: The USENIX Magazine, February 2011:42-47.
+    O. Tange (2018): GNU Parallel 2018, Mar 2018, ISBN 9781387509881,
+    DOI https://doi.org/10.5281/zenodo.11460


 = New versions =
--- a/doc/release_new_version
+++ b/doc/release_new_version
@@ -199,7 +199,7 @@ to:parallel@gnu.org, bug-parallel@gnu.org
 stable-bcc: Jesse Alama <jessealama@fastmail.fm>


-Subject: GNU Parallel 20180522 ('') released <<[stable]>>
+Subject: GNU Parallel 20180522 ('Korea') released <<[stable]>>

 GNU Parallel 20180522 ('') <<[stable]>> has been released. It is available for download at: http://ftpmirror.gnu.org/parallel/

--- a/src/parallel
+++ b/src/parallel
@@ -4331,6 +4331,17 @@ sub embed {
 	# Read the source from $0
 	my @source = <$fh>;
 	my $user = $ENV{LOGNAME} || $ENV{USERNAME} || $ENV{USER};
+	my @env_parallel_source = ();
+	my $shell = $Global::shell;
+	$shell =~ s:.*/::;
+	for(which("env_parallel.$shell")) {
+	    -r $_ or next;
+	    # Read the source of env_parallel.shellname
+	    open(my $env_parallel_source_fh, $_) || die;
+	    @env_parallel_source = <$env_parallel_source_fh>;
+	    close $env_parallel_source_fh;
+	    last;
+	}
 	print "#!$Global::shell

 # Copyright (C) 2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,
@@ -4390,9 +4401,16 @@ parallel() {
    # Read the source from the fifo
    perl $_fifo_with_parallel_source "$@"
 }
+!,
+	@env_parallel_source,
+	q!

-# This will call the function above
+# This will call the functions above
 parallel -k echo ::: Put your code here
+env_parallel --session
+env_parallel -k echo ::: Put your code here
+parset p,y,c,h -k echo ::: Put your code here
+echo $p $y $c $h
 !;
    } else {
 	::error("Cannot open $0");
--- a/src/parallel_alternatives.pod
+++ b/src/parallel_alternatives.pod
@@ -162,26 +162,30 @@ the last half of the line is from another process. The example
 B<Parallel grep> cannot be done reliably with B<xargs> because of
 this. To see this in action try:

-  parallel perl -e '\$a=\"1{}\"x10000000\;print\ \$a,\"\\n\"' '>' {} \
-    ::: a b c d e f
-  ls -l a b c d e f
-  parallel -kP4 -n1 grep 1 > out.par ::: a b c d e f
-  echo a b c d e f | xargs -P4 -n1 grep 1 > out.xargs-unbuf
-  echo a b c d e f | \
-    xargs -P4 -n1 grep --line-buffered 1 > out.xargs-linebuf
-  echo a b c d e f | xargs -n1 grep 1 > out.xargs-serial
-  ls -l out*
-  md5sum out*
+  parallel perl -e '\$a=\"1\".\"{}\"x10000000\;print\ \$a,\"\\n\"' \
+    '>' {} ::: a b c d e f g h
+  # Serial = no mixing = the wanted result
+  # 'tr -s a-z' squeezes repeating letters into a single letter
+  echo a b c d e f g h | xargs -P1 -n1 grep 1 | tr -s a-z
+  # Compare to 8 jobs in parallel
+  parallel -kP8 -n1 grep 1 ::: a b c d e f g h | tr -s a-z
+  echo a b c d e f g h | xargs -P8 -n1 grep 1 | tr -s a-z
+  echo a b c d e f g h | xargs -P8 -n1 grep --line-buffered 1 | \
+    tr -s a-z

 Or try this:

  slow_seq() {
+    echo Count to "$@"
    seq "$@" |
      perl -ne '$|=1; for(split//){ print; select($a,$a,$a,0.100);}'
  }
  export -f slow_seq
-  seq 5 | xargs -n1 -P0 -I {} bash -c 'slow_seq {}'
-  seq 5 | parallel -P0 slow_seq {}
+  # Serial = no mixing = the wanted result
+  seq 8 | xargs -n1 -P1 -I {} bash -c 'slow_seq {}'
+  # Compare to 8 jobs in parallel
+  seq 8 | parallel -P8 slow_seq {}
+  seq 8 | xargs -n1 -P8 -I {} bash -c 'slow_seq {}'

 B<xargs> has no support for keeping the order of the output, therefore
 if running jobs in parallel using B<xargs> the output of the second
@@ -201,7 +205,8 @@ composed commands and redirection require using B<bash -c>.
  ls | parallel "wc {} >{}.wc"
  ls | parallel "echo {}; ls {}|wc"

-becomes (assuming you have 8 cores)
+becomes (assuming you have 8 cores and that none of the file names
+contain space, " or ').

  ls | xargs -d "\n" -P8 -I {} bash -c "wc {} >{}.wc"
  ls | xargs -d "\n" -P8 -I {} bash -c "echo {}; ls {}|wc"
@@ -791,12 +796,14 @@ https://github.com/rofl0r/jobflow

 B<gargs> can run multiple jobs in parallel.

-It caches output in memory. This causes it to be extremely slow when
-the output is larger than the physical RAM, and can cause the system
-to run out of memory.
+Older versions cache output in memory. This causes it to be extremely
+slow when the output is larger than the physical RAM, and can cause
+the system to run out of memory.

 See more details on this in B<man parallel_design>.

+Newer versions cache output in files, but leave files in $TMPDIR if it
+is killed.

 Output to stderr (standard error) is changed if the command fails.

--- a/testsuite/tests-to-run/parallel-local-100s.sh
+++ b/testsuite/tests-to-run/parallel-local-100s.sh
@@ -72,41 +72,46 @@ par_over_4GB() {
 	nice md5sum
 }

-
-
 par_mem_leak() {
    echo "### test for mem leak"

+    export parallel=parallel
    no_mem_leak() {
-	measure() {
-	    # Input:
-	    #   $1 = iterations
-	    #   $2 = sleep 1 sec for every $2
-	    seq $1 | ramusage parallel -u sleep '{= $_=$_%'$2'?0:1 =}'
+	run_measurements() {
+	    from=$1
+	    to=$2
+	    pause_every=$3
+	    measure() {
+		# Input:
+		#   $1 = iterations
+		#   $2 = sleep 1 sec for every $2
+		seq $1 | ramusage $parallel -u sleep '{= $_=$_%'$2'?0:1 =}'
+	    }
+	    export -f measure
+
+	    seq $from $to | $parallel measure {} $pause_every |
+    		sort -n
 	}
-	export -f measure

 	# Return false if leaking
-	max1000=$(parallel measure {} 100000 ::: 1000 1000 1000 1000 1000 1000 1000 1000 |
-    			 sort -n | tail -n 1)
-	min30000=$(parallel measure {} 100000 ::: 3000 3000 3000 |
-    			  sort -n | head -n 1)
+	# Normal: 16940-17320
+	max1000=$(run_measurements 1000 1007 100000 | tail -n1)
+	min30000=$(run_measurements 15000 15004 100000 | head -n1)
 	if [ $max1000 -gt $min30000 ] ; then
+	    echo Probably no leak $max1000 -gt $min30000
+	    return 0
+	else
+	    echo Probably leaks $max1000 not -gt $min30000
 	    # Make sure there are a few sleeps
-	    max1000=$(parallel measure {} 100 ::: 1000 1000 1000 1000 1000 1000 1000 1000 |
-			     sort -n | tail -n 1)
-	    min30000=$(parallel measure {} 100 ::: 3000 3000 3000 |
-			      sort -n | head -n 1)
+	    max1000=$(run_measurements 1001 1007 100 | tail -n1)
+	    min30000=$(run_measurements 30000 30004 100 | head -n1)
 	    if [ $max1000 -gt $min30000 ] ; then
-		echo $max1000 -gt $min30000 = no leak
+		echo $max1000 -gt $min30000 = very likely no leak
 		return 0
 	    else
-		echo not $max1000 -gt $min30000 = possible leak
+		echo not $max1000 -gt $min30000 = very likely leak
 		return 1
 	    fi
-	else
-	    echo not $max1000 -gt $min30000 = possible leak
-	    return 1
 	fi
    }

--- a/testsuite/tests-to-run/parallel-local-ssh9.sh
+++ b/testsuite/tests-to-run/parallel-local-ssh9.sh
@@ -13,7 +13,7 @@ myvar=OK
 parallel echo ::: parallel_OK
 PATH=/usr/sbin:/usr/bin:/sbin:/bin
 # Do not look for parallel in /usr/local/bin
-. \`which env_parallel.ash\`
+#. \`which env_parallel.ash\`
 }
    ' | tac > parallel-embed
    chmod +x parallel-embed
@@ -37,7 +37,7 @@ myvar=OK
 parallel echo ::: parallel_OK
 PATH=/usr/sbin:/usr/bin:/sbin:/bin
 # Do not look for parallel in /usr/local/bin
-. \`which env_parallel.bash\`
+#. \`which env_parallel.bash\`
 }
    ' | tac > parallel-embed
    chmod +x parallel-embed
@@ -69,12 +69,12 @@ myvar=OK
 parallel echo ::: parallel_OK
 PATH=/usr/sbin:/usr/bin:/sbin:/bin
 # Do not look for parallel in /usr/local/bin
-. \`which env_parallel.ksh\`
+#. \`which env_parallel.ksh\`
 }
    ' | tac > parallel-embed
    chmod +x parallel-embed
    ./parallel-embed
-#    rm parallel-embed
+    rm parallel-embed
 _EOF
  )
  ssh ksh@lo "$myscript"
@@ -93,12 +93,12 @@ myvar=OK
 parallel echo ::: parallel_OK
 PATH=/usr/sbin:/usr/bin:/sbin:/bin
 # Do not look for parallel in /usr/local/bin
-. \`which env_parallel.sh\`
+#. \`which env_parallel.sh\`
 }
    ' | tac > parallel-embed
    chmod +x parallel-embed
    ./parallel-embed
-#    rm parallel-embed
+    rm parallel-embed
 _EOF
  )
  ssh sh@lo "$myscript"
@@ -121,7 +121,6 @@ myvar=OK
 parallel echo ::: parallel_OK
 PATH=/usr/sbin:/usr/bin:/sbin:/bin
 # Do not look for parallel in /usr/local/bin
-. \`which env_parallel.zsh\`
 }
    ' | tac > parallel-embed
    chmod +x parallel-embed
@@ -138,4 +137,5 @@ export -f $(compgen -A function | grep par_)
 compgen -A function | grep par_ | sort -r |
 #    parallel --joblog /tmp/jl-`basename $0` --delay $D -j$P --tag -k '{} 2>&1'
    parallel --joblog /tmp/jl-`basename $0` -j200% --tag -k '{} 2>&1' |
-    perl -pe 's/line \d\d\d:/line XXX:/'
+    perl -pe 's/line \d\d\d+:/line XXX:/' |
+    perl -pe 's/\[\d\d\d+\]:/[XXX]:/'
--- a/testsuite/wanted-results/parallel-local-ssh9
+++ b/testsuite/wanted-results/parallel-local-ssh9
@@ -6,10 +6,7 @@ par_zsh_embed	your
 par_zsh_embed	code
 par_zsh_embed	here
 par_zsh_embed	parallel_OK
-par_zsh_embed	/home/zsh/.zshenv:.:3: no such file or directory: env_parallel.zsh
 par_zsh_embed	env_parallel --env OK
-par_zsh_embed	/home/zsh/.zshenv:.:3: no such file or directory: env_parallel.zsh
-par_zsh_embed	/home/zsh/.zshenv:.:3: no such file or directory: env_parallel.zsh
 par_zsh_embed	_which:12: argument list too long: perl
 par_zsh_embed	env_parallel: Error: Your environment is too big.
 par_zsh_embed	env_parallel: Error: You can try 2 different approaches:
@@ -19,6 +16,11 @@ par_zsh_embed	env_parallel: Error:      env_parallel --record-env
 par_zsh_embed	env_parallel: Error:    And then use '--env _'
 par_zsh_embed	env_parallel: Error: For details see: man env_parallel
 par_zsh_embed	ParsetOK
+par_zsh_embed	Put
+par_zsh_embed	your
+par_zsh_embed	code
+par_zsh_embed	here
+par_zsh_embed	Put your code here
 par_tcsh_embed	Not implemented
 par_sh_embed	--embed
 par_sh_embed	Redirect the output to a file and add your changes at the end:
@@ -31,6 +33,11 @@ par_sh_embed	parallel_OK
 par_sh_embed	env_parallel --env OK
 par_sh_embed	env_parallel_OK
 par_sh_embed	ParsetOK
+par_sh_embed	Put
+par_sh_embed	your
+par_sh_embed	code
+par_sh_embed	here
+par_sh_embed	Put your code here
 par_ksh_embed	--embed
 par_ksh_embed	Redirect the output to a file and add your changes at the end:
 par_ksh_embed	  /usr/local/bin/parallel --embed > new_script
@@ -40,7 +47,7 @@ par_ksh_embed	code
 par_ksh_embed	here
 par_ksh_embed	parallel_OK
 par_ksh_embed	env_parallel --env OK
-par_ksh_embed	./parallel-embed[118]: perl: /usr/bin/perl: cannot execute [Argument list too long]
+par_ksh_embed	./parallel-embed[XXX]: perl: /usr/bin/perl: cannot execute [Argument list too long]
 par_ksh_embed	env_parallel: Error: Your environment is too big.
 par_ksh_embed	env_parallel: Error: You can try 2 different approaches:
 par_ksh_embed	env_parallel: Error: 1. Use --env and only mention the names to copy.
@@ -49,6 +56,11 @@ par_ksh_embed	env_parallel: Error:      env_parallel --record-env
 par_ksh_embed	env_parallel: Error:    And then use '--env _'
 par_ksh_embed	env_parallel: Error: For details see: man env_parallel
 par_ksh_embed	ParsetOK
+par_ksh_embed	Put
+par_ksh_embed	your
+par_ksh_embed	code
+par_ksh_embed	here
+par_ksh_embed	Put your code here
 par_fish_embed	Not implemented
 par_csh_embed	Not implemented
 par_bash_embed	--embed
@@ -60,7 +72,7 @@ par_bash_embed	code
 par_bash_embed	here
 par_bash_embed	parallel_OK
 par_bash_embed	env_parallel --env OK
-par_bash_embed	/usr/local/bin/env_parallel.bash: line XXX: /usr/bin/perl: Argument list too long
+par_bash_embed	./parallel-embed: line XXX: /usr/bin/perl: Argument list too long
 par_bash_embed	env_parallel: Error: Your environment is too big.
 par_bash_embed	env_parallel: Error: You can try 2 different approaches:
 par_bash_embed	env_parallel: Error: 1. Use --env and only mention the names to copy.
@@ -69,6 +81,11 @@ par_bash_embed	env_parallel: Error:      env_parallel --record-env
 par_bash_embed	env_parallel: Error:    And then use '--env _'
 par_bash_embed	env_parallel: Error: For details see: man env_parallel
 par_bash_embed	ParsetOK
+par_bash_embed	Put
+par_bash_embed	your
+par_bash_embed	code
+par_bash_embed	here
+par_bash_embed	Put your code here
 par_ash_embed	--embed
 par_ash_embed	Redirect the output to a file and add your changes at the end:
 par_ash_embed	  /usr/local/bin/parallel --embed > new_script
@@ -80,3 +97,8 @@ par_ash_embed	parallel_OK
 par_ash_embed	env_parallel --env OK
 par_ash_embed	env_parallel_OK
 par_ash_embed	ParsetOK
+par_ash_embed	Put
+par_ash_embed	your
+par_ash_embed	code
+par_ash_embed	here
+par_ash_embed	Put your code here