More examples

2024-12-22 12:47:54 +00:00 · 2010-08-14 20:39:33 +02:00 · a038ade0de
parent f888da9fdb
commit a038ade0de
3 changed files with 72 additions and 32 deletions
--- a/doc/FUTURE_IDEAS
+++ b/doc/FUTURE_IDEAS
@@ -1,31 +1,16 @@
+Default sshloginfile ~/.parallel/sshloginfile
+--sshloginfile .. or -S .. means use default sshloginfile

+# Allow 7 to run. After the 7th is started, block until one is dead
+parallel --mutex uniqidentifier -j7 command
+parallel --automutex  -j7 command
+mdm.screen find dir -execdir mdm-run cmd {} \;
+find dir -execdir parallel --automutex cmd {} \;
+getppid

-# Gzip all files in parallel
-parallel gzip ::: *
-
-# Convert *.wav to *.mp3 using LAME running one process per CPU core:
-parallel -j+0 lame {} -o {.}.mp3 ::: *.wav
-
-# Make an uncompressed version of all *.gz
-parallel zcat {} ">"{.} ::: *.gz
-
-# Recompress all .gz files using bzip2 running 1 job per CPU core:
-find . -name '*.gz' | parallel -j+0 "zcat {} | bzip2 >{.}.bz2 && rm {}"
-
-# Create a directory for each zip-file and unzip it in that dir
-parallel 'mkdir {.}; cd {.}; unzip ../{}' ::: *.zip
-
-# Convert all *.mp3 in subdirs to *.ogg running 
-#   one process per CPU core on local computer and server2
-find . -name '*.mp3' | parallel --trc {.}.ogg -j+0 -S server2,: \
-         'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg'
-
-# Run mycmd on column 1-3 of each row of TAB separated values
-parallel -a table_file.tsv --colsep '\t' mycmd -o {2} {3} -i {1}
-
-# Run traceroute in parallel, but keep the output order the same
-parallel -k traceroute ::: foss.org.my debian.org freenetproject.org
-
+fex syntax for splitting fields
+http://www.semicomplete.com/projects/fex/
+sql :foo 'select * from bar' | parallel --fex '|{1,2}' do_stuff {2} {1}


 Unittest: --colsep + multiple -a
@@ -346,3 +331,30 @@ do not start new jobs. Print out the number of jobs waiting to
 complete on STDERR. Accept sig INT again to kill now. This seems to be
 hard, as all foreground processes get the INT from the shell.

+
+
+# Gzip all files in parallel
+parallel gzip ::: *
+
+# Convert *.wav to *.mp3 using LAME running one process per CPU core:
+parallel -j+0 lame {} -o {.}.mp3 ::: *.wav
+
+# Make an uncompressed version of all *.gz
+parallel zcat {} ">"{.} ::: *.gz
+
+# Recompress all .gz files using bzip2 running 1 job per CPU core:
+find . -name '*.gz' | parallel -j+0 "zcat {} | bzip2 >{.}.bz2 && rm {}"
+
+# Create a directory for each zip-file and unzip it in that dir
+parallel 'mkdir {.}; cd {.}; unzip ../{}' ::: *.zip
+
+# Convert all *.mp3 in subdirs to *.ogg running 
+#   one process per CPU core on local computer and server2
+find . -name '*.mp3' | parallel --trc {.}.ogg -j+0 -S server2,: \
+         'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg'
+
+# Run mycmd on column 1-3 of each row of TAB separated values
+parallel -a table_file.tsv --colsep '\t' mycmd -o {2} {3} -i {1}
+
+# Run traceroute in parallel, but keep the output order the same
+parallel -k traceroute ::: foss.org.my debian.org freenetproject.org
--- a/doc/release_new_version
+++ b/doc/release_new_version
@@ -78,7 +78,8 @@ Newsgroups: comp.unix.shell,comp.unix.admin

 <<<<<
 to:parallel@gnu.org, bug-parallel@gnu.org, info-gnu@gnu.org, bug-directory@gnu.org
-cc:Peter Simons <simons@cryp.to>, Sandro Cazzaniga <kharec@mandriva.org>
+cc:Peter Simons <simons@cryp.to>, Sandro Cazzaniga <kharec@mandriva.org>,
+   Tim Cuthbertson <tim3d.junk@gmail.com>

 Subject: GNU Parallel 20100722 released

@@ -88,13 +89,23 @@ download at: http://ftp.gnu.org/gnu/parallel/
 New in this release:

 * With --colsep a table can be used as input. Example:
-  cat table | parallel --colsep '\s+' echo col1 {1} col2 {2}
+  cat tab_sep_table | parallel --colsep '\t' echo col1 {1} col2 {2}

 * --trim can remove white space around arguments.

 * Zero install package. Thanks to Tim Cuthbertson <tim3d dot junk at
  gmail dot com>

+* OpenSUSE package. Thanks to Markus Ammer <mkmm at gmx-topmail dot
+  de>
+
+* Web review http://oentend.blogspot.com/2010/08/gnu-parallel.html
+  Thanks to Pavel Nuzhdin <pnzhdin at gmail dot com>
+
+* Web review http://psung.blogspot.com/2010/08/gnu-parallel.html
+  Thanks to Phil Sung
+
+
 = About GNU Parallel =

 GNU Parallel is a shell tool for executing jobs in parallel using one
--- a/src/parallel
+++ b/src/parallel
@@ -815,11 +815,11 @@ B<ls | parallel -m mv {} destdir>

 To remove the files I<pict0000.jpg> .. I<pict9999.jpg> you could do:

-B<seq -f %04g 0 9999 | parallel rm pict{}.jpg>
+B<seq -w 0 9999 | parallel rm pict{}.jpg>

 You could also do:

-B<seq -f %04g 0 9999 | perl -pe 's/(.*)/pict$1.jpg/' | parallel -m rm>
+B<seq -w 0 9999 | perl -pe 's/(.*)/pict$1.jpg/' | parallel -m rm>

 The first will run B<rm> 10000 times, while the last will only run
 B<rm> as many times needed to keep the command line length short
@@ -827,7 +827,7 @@ enough to avoid B<Argument list too long> (it typically runs 1-2 times).

 You could also run:

-B<seq -f %04g 0 9999 | parallel -X rm pict{}.jpg>
+B<seq -w 0 9999 | parallel -X rm pict{}.jpg>

 This will also only run B<rm> as many times needed to keep the command
 line length short enough.
@@ -925,6 +925,21 @@ foo) you can do:

 B<ls *.tar.gz| parallel -U {tar} 'echo {tar}|parallel "mkdir -p {.} ; tar -C {.} -xf {.}.tar.gz"'>

+=head1 EXAMPLE: Download 10 images for each of the past 30 days
+
+Let us assume a website stores images like:
+
+   http://www.website.com/path/to/YYYYMMDD_##.jpg
+
+where YYYYMMDD is the date and ## is the number 01-10.  This will
+generate the past 30 days as YYYYMMDD:
+
+B<seq 1 30 | parallel date -d '"today -{} days"' +%Y%m%d>
+
+Based on this we can let GNU B<parallel> generate 10 B<wget>s per day:
+
+B<I<the above> | parallel -I {o} seq -w 1 10 "|" parallel wget
+http://www.website.com/path/to/{o}_{}.jpg>

 =head1 EXAMPLE: Rewriting a for-loop and a while-loop

@@ -1278,7 +1293,7 @@ the currently running jobs are finished before exiting.

 The environment variable $PARALLEL_PID is set by GNU B<parallel> and
 is visible to the jobs started from GNU B<parallel>. This makes it
-possible for the jobs to communicate directly to GNU <parallel>.
+possible for the jobs to communicate directly to GNU B<parallel>.

 B<Example:> If each of the jobs tests a solution and one of jobs finds
 the solution the job can tell GNU B<parallel> not to start more jobs
@@ -1320,6 +1335,8 @@ The file ~/.parallelrc will be read if it exists. It should be
 formatted like the environment variable $PARALLEL. Lines starting with
 '#' will be ignored.

+Options on the command line take precedence over the environment
+variable $PARALLEL, which takes precedence over the file ~/.parallelrc.

 =head1 EXIT STATUS
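
Note on the seq change in src/parallel: the commit replaces `seq -f %04g 0 9999` with `seq -w 0 9999`. The sketch below is not part of the commit; it only checks, assuming a `seq` that supports both `-f` and `-w` (GNU coreutils does), that the two forms produce the same zero-padded list for this range. The variable names `old` and `new` are illustrative.

```shell
# Zero-padded numbers 0000..9999 using the form removed by the commit
old=$(seq -f %04g 0 9999)

# The same range using the form added by the commit; -w pads every
# number to the width of the widest one (here 4 digits)
new=$(seq -w 0 9999)

# Report whether the two lists are byte-for-byte the same
if [ "$old" = "$new" ]; then
  echo "identical"
else
  echo "different"
fi
```

The equivalence depends on the range: `-w` derives the width from the largest number, so `seq -w 0 99` gives two digits, while `-f %04g` always pads to four.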