niceload: Wrong regexp for loadaverage on MacOS X. Force LANG=C.

Ole Tange 2016-08-03 23:42:15 +02:00
parent efcfefedf0
commit 769d2706f2
6 changed files with 145 additions and 56 deletions


@@ -219,30 +219,21 @@ cc:Tim Cuthbertson <tim3d.junk@gmail.com>,
Ryoichiro Suzuki <ryoichiro.suzuki@gmail.com>,
Jesse Alama <jesse.alama@gmail.com>
Subject: GNU Parallel 20160722 ('Brexit') released <<[stable]>>
Subject: GNU Parallel 20160722 ('Munich/Erdogan') released <<[stable]>>
GNU Parallel 20160722 ('Brexit') <<[stable]>> has been released. It is available for download at: http://ftp.gnu.org/gnu/parallel/
GNU Parallel 20160722 ('Munich/Erdogan') <<[stable]>> has been released. It is available for download at: http://ftp.gnu.org/gnu/parallel/
<<No new functionality was introduced so this is a good candidate for a stable release.>>
Haiku of the month:
Pipes are fast and good.
Use them in your programs, too.
Use GNU Parallel
<<>>
-- Ole Tange
New in this release:
* env_parallel is now ready for wider testing. It is still beta quality.
* GNU Parallel was cited in: Exome sequencing of geographically diverse barley landraces and wild relatives gives insights into environmental adaptation http://www.nature.com/ng/journal/vaop/ncurrent/full/ng.3612.html?WT.feed_name=subjects_genetics#references
* env_parallel is heavily modified for all shells and testing has been increased.
* Selectively choosing what to export using --env now works for env_parallel (bash, csh, fish, ksh, pdksh, tcsh, zsh).
* --round-robin now gives more work to a job that processes faster instead of same amount to all jobs.
* --pipepart works on block devices on GNU/Linux.
* <<Possibly http://link.springer.com/chapter/10.1007%2F978-3-319-22053-6_46>>
@@ -270,31 +261,19 @@ for Big Data Applications https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumb
* <<link No citation: Next-generation TCP for ns-3 simulator http://www.sciencedirect.com/science/article/pii/S1569190X15300939>>
* <<link No citation: Scalable metagenomics alignment research tool (SMART): a scalable, rapid, and complete search heuristic for the classification of metagenomic sequences from complex sequence populations http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-1159-6#Bib1>>
* <<No citation: Argumentation Models for Cyber Attribution http://arxiv.org/pdf/1607.02171.pdf>>
* <<Possible: http://link.springer.com/article/10.1007/s12021-015-9290-5 http://link.springer.com/protocol/10.1007/978-1-4939-3578-9_14>>
* GNU Parallel was cited in: HybPiper: Extracting Coding Sequence and Introns for Phylogenetics from High-Throughput Sequencing Reads Using Target Enrichment http://www.bioone.org/doi/full/10.3732/apps.1600016
* Easy parallelization with GNU parallel http://mpharrigan.com/2016/08/02/parallel.html
* GNU Parallel was cited in: StrAuto - Automation and Parallelization of STRUCTURE Analysis http://www.crypticlineage.net/download/strauto/strauto_doc.pdf
* Facebook V: Predicting Check Ins, Winner's Interview: 2nd Place, Markus Kliegl http://blog.kaggle.com/2016/08/02/facebook-v-predicting-check-ins-winners-interview-2nd-place-markus-kliegl/
* GNU Parallel was cited in: Tools and techniques for computational reproducibility http://gigascience.biomedcentral.com/articles/10.1186/s13742-016-0135-4
* Parallel import http://www.manitou-mail.org/blog/2016/07/parallel-import/
* GNU Parallel was cited in: FlashPCA: fast sparse canonical correlation analysis of genomic data http://biorxiv.org/content/biorxiv/suppl/2016/04/06/047217.DC1/047217-1.pdf
* GNU Parallel was cited in: Computational Design of DNA-Binding Proteins http://link.springer.com/protocol/10.1007/978-1-4939-3569-7_16
* GNU Parallel was cited in: Math Indexer and Searcher under the Hood: Fine-tuning Query Expansion and Unification Strategies http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings12/pdf/ntcir/MathIR/05-NTCIR12-MathIR-RuzickaM.pdf
* GNU Parallel was cited in: The Evolution and Fate of Super-Chandrasekhar Mass White Dwarf Merger Remnants http://arxiv.org/pdf/1606.02300.pdf
* GNU Parallel was cited in: Evaluation of Coastal Scatterometer Products https://mdc.coaps.fsu.edu/scatterometry/meeting/docs/2016/Thu_AM/coastal-poster.pdf
* GNU Parallel was used in: https://github.com/splitice/bulkdnsblcheck
* The iconv slurp misfeature http://www.openfusion.net/linux/iconv_slurp_misfeature
* แบบว่า CPU เหลือ ("Got CPU to spare", in Thai) https://veer66.wordpress.com/2016/06/15/gnu-parallel/
* Large file batch processing using NodeJs and GNU Parallel http://www.zacorndorff.com/2016/07/27/large-file-batch-processing-using-nodejs-and-gnu-parallel/
* Bug fixes and man page updates.


@@ -24,7 +24,7 @@
use strict;
use Getopt::Long;
$Global::progname="niceload";
$Global::version = 20160722;
$Global::version = 20160724;
Getopt::Long::Configure("bundling","require_order");
get_options_from_array(\@ARGV) || die_usage();
if($opt::version) {
@@ -1005,7 +1005,7 @@ sub load_status_linux {
::die_bug("proc_loadavg");
}
close IN;
} elsif (open(IN,"uptime|")) {
} elsif (open(IN,"LANG=C uptime|")) {
my $upString = <IN>;
if($upString =~ m/averages?.\s*(\d+\.\d+)/) {
$loadavg = $1;
@@ -1019,7 +1019,7 @@ sub load_status_linux {
sub load_status_darwin {
my $loadavg = `sysctl vm.loadavg`;
if($loadavg =~ /vm\.loadavg: { ([0-9.]+) ([0-9.]+) ([0-9.]+) }/) {
if($loadavg =~ /vm\.loadavg: \{ ([0-9.]+) ([0-9.]+) ([0-9.]+) \}/) {
$loadavg = $1;
} elsif (open(IN,"LANG=C uptime|")) {
my $upString = <IN>;
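The LANG=C fix above matters because a localized B<uptime> may print decimal commas (e.g. "1,35"), which a regexp expecting \d+\.\d+ will never match. A minimal shell sketch of the same parsing, run on a canned uptime line instead of the real command:

```shell
# Canned LANG=C uptime line (macOS prints "load averages:", Linux "load average:")
line='22:40  up 3 days, 12:01, 2 users, load averages: 1.35 1.43 1.51'
# Same idea as niceload's Perl regexp: match "average" or "averages",
# then capture the first dot-decimal number (the 1-minute load).
printf '%s\n' "$line" | sed -E 's/.*averages?: *([0-9]+\.[0-9]+).*/\1/'   # prints 1.35
```

With a non-C locale the numbers could be "1,35 1,43 1,51" and the capture group would find nothing, which is exactly the bug this commit fixes.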


@@ -32,9 +32,9 @@ run 1 second, suspend (3.00-1.00) seconds, run 1 second, suspend
=over 9
=item B<-B> (beta testing)
=item B<-B>
=item B<--battery> (beta testing)
=item B<--battery>
Suspend if the system is running on battery. Shorthand for: -l -1 --sensor 'cat /sys/class/power_supply/BAT0/status /proc/acpi/battery/BAT0/state 2>/dev/null |grep -i -q discharging; echo $?'
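The sensor in the shorthand above prints B<0> while the machine is discharging and B<1> otherwise: B<grep -q> sets the exit status and B<echo $?> turns it into the number the sensor reports. A sketch with a canned status string standing in for the sysfs file:

```shell
# "Discharging" simulates the content of /sys/class/power_supply/BAT0/status
# while on battery; the pipeline prints the sensor value niceload reads.
printf 'Discharging\n' | grep -i -q discharging; echo $?   # prints 0
```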
@@ -102,12 +102,12 @@ B<--noswap> is over limit if the system is swapping both in and out.
B<--noswap> will set both B<--start-noswap> and B<--run-noswap>.
=item B<--net> (beta testing)
=item B<--net>
Shorthand for B<--nethops 3>.
=item B<--nethops> I<h> (beta testing)
=item B<--nethops> I<h>
Network nice. Pause if the internet connection is overloaded.
@@ -140,9 +140,9 @@ Process ID of process to suspend. You can specify multiple process IDs
with multiple B<-p> I<PID>.
=item B<--prg> I<program> (beta testing)
=item B<--prg> I<program>
=item B<--program> I<program> (beta testing)
=item B<--program> I<program>
Name of running program to suspend. You can specify multiple programs
with multiple B<--prg> I<program>. If no processes with the name


@@ -632,9 +632,7 @@ The variable '_' is special. It will copy all exported environment
variables except for the ones mentioned in ~/.parallel/ignored_vars.
To copy the full environment (both exported and not exported
variables, arrays, and functions) use B<env_parallel> as described
under the option I<command>.
variables, arrays, and functions) use B<env_parallel>.
See also: B<--record-env>.
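The gap that B<env_parallel> closes can be seen in a plain bash sketch (the function name is hypothetical; the outer bash -c is only there so it runs under any shell): a child shell sees a function only after B<export -f>, while unexported functions silently vanish.

```shell
bash -c '
  myfunc() { echo "hello $1"; }
  export -f myfunc          # without this line the inner bash would not know myfunc
  bash -c "myfunc world"    # prints: hello world
'
```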
@@ -2512,7 +2510,7 @@ B<--env>:
If your environment (aliases, variables, and functions) is small you
can copy the full environment without having to B<export -f>
anything. See B<env_parallel> earlier in the man page.
anything. See B<env_parallel>.
=head1 EXAMPLE: Function tester
@@ -2579,16 +2577,16 @@ foo) you can do:
parallel --plus 'mkdir {..}; tar -C {..} -xf {}' ::: *.tar.gz
=head1 EXAMPLE: Download 10 images for each of the past 30 days
=head1 EXAMPLE: Download 24 images for each of the past 30 days
Let us assume a website stores images like:
http://www.example.com/path/to/YYYYMMDD_##.jpg
where YYYYMMDD is the date and ## is the number 01-10. This will
where YYYYMMDD is the date and ## is the number 01-24. This will
download images for the past 30 days:
parallel wget http://www.example.com/path/to/'$(date -d "today -{1} days" +%Y%m%d)_{2}.jpg' ::: $(seq 30) ::: $(seq -w 10)
parallel wget http://www.example.com/path/to/'$(date -d "today -{1} days" +%Y%m%d)_{2}.jpg' ::: $(seq 30) ::: $(seq -w 24)
B<$(date -d "today -{1} days" +%Y%m%d)> will give the dates in
YYYYMMDD with B<{1}> days subtracted.
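The date arithmetic can be tried on its own. This assumes GNU date, since B<-d> with relative date strings is not POSIX:

```shell
# The date 1 day ago in YYYYMMDD format, i.e. the {1}=1 case from the example
date -d "today -1 days" +%Y%m%d
```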
@@ -4383,8 +4381,7 @@ support running jobs on remote computers.
B<prll> encourages using BASH aliases and BASH functions instead of
scripts. GNU B<parallel> supports scripts directly, functions if they
are exported using B<export -f>, and aliases if using B<env_parallel>
described earlier.
are exported using B<export -f>, and aliases if using B<env_parallel>.
B<prll> generates a lot of status information on stderr (standard
error) which makes it harder to use the stderr (standard error) output
@@ -4729,6 +4726,66 @@ B<4> find . -name '*.bmp' | jobflow -threads=8 -exec bmp2jpeg {.}.bmp {.}.jpg
B<4> find . -name '*.bmp' | parallel -j8 bmp2jpeg {.}.bmp {.}.jpg
=head2 DIFFERENCES BETWEEN gargs AND GNU Parallel
B<gargs> can run multiple jobs in parallel.
It caches output in memory. This causes it to be extremely slow when
the output is larger than the physical RAM, and can cause the system
to run out of memory.
See more details on this in B<man parallel_design>.
Output to stderr (standard error) is changed if the command fails.
Here are the two examples from B<gargs> website.
B<1> seq 12 -1 1 | gargs -p 4 -n 3 "sleep {0}; echo {1} {2}"
B<1> seq 12 -1 1 | parallel -P 4 -n 3 "sleep {1}; echo {2} {3}"
B<2> cat t.txt | gargs --sep "\s+" -p 2 "echo '{0}:{1}-{2}' full-line: \'{}\'"
B<2> cat t.txt | parallel --colsep "\\s+" -P 2 "echo '{1}:{2}-{3}' full-line: \'{}\'"
=head2 DIFFERENCES BETWEEN orgalorg AND GNU Parallel
B<orgalorg> can run the same job on multiple machines. This is related
to B<--onall> and B<--nonall>.
B<orgalorg> supports entering the SSH password - provided it is the
same for all servers. GNU B<parallel> advocates using B<ssh-agent>
instead, but it is possible to emulate B<orgalorg>'s behavior by
setting SSHPASS and by using B<--ssh "sshpass ssh">.
To make the emulation easier, make a simple alias:
alias par_emul="parallel -j0 --ssh 'sshpass ssh' --nonall --tag --linebuffer"
If you want to supply a password run:
SSHPASS=`ssh-askpass`
or set the password directly:
SSHPASS=P4$$w0rd!
If the above is set up you can then do:
orgalorg -o frontend1 -o frontend2 -p -C uptime
par_emul -S frontend1 -S frontend2 uptime
orgalorg -o frontend1 -o frontend2 -p -C top -bid 1
par_emul -S frontend1 -S frontend2 top -bid 1
orgalorg -o frontend1 -o frontend2 -p -er /tmp -n 'md5sum /tmp/bigfile' -S bigfile
par_emul -S frontend1 -S frontend2 --basefile bigfile --workdir /tmp md5sum /tmp/bigfile
B<orgalorg> has a progress indicator for the transferring of a
file. GNU B<parallel> does not.
=head2 DIFFERENCES BETWEEN ClusterSSH AND GNU Parallel
@@ -4834,8 +4891,8 @@ or:
it may be because I<command> is not known, but it could also be
because I<command> is an alias or a function. If it is a function you
need to B<export -f> the function first. An alias will only work if you use
B<env_parallel> described earlier.
need to B<export -f> the function first. An alias will only work if
you use B<env_parallel>.
=head1 REPORTING BUGS


@@ -543,7 +543,7 @@ The wrapper looks like this:
Transferring of variables and functions given by B<--env> is done by
running a Perl script remotely that calls the actual command. The Perl
script sets B<$ENV{>I<variable>B<}> to the correct value before
exec'ing the a shell that runs the function definition followed by the
exec'ing a shell that runs the function definition followed by the
actual command.
The function B<env_parallel> copies the full current environment into
@@ -743,10 +743,63 @@ not need to sync them to disk.
It gives the odd situation that a disk can be fully used, but there
are no visible files on it.
=head3 Comparing to buffering in memory
B<gargs> is a parallelizing tool that buffers in memory. It is
therefore a useful way of comparing the advantages and disadvantages.
On a system with 6 GB RAM free and 6 GB free swap these were tested
with different sizes:
echo /dev/zero | gargs "head -c $size {}" >/dev/null
echo /dev/zero | parallel "head -c $size {}" >/dev/null
The results are here:
JobRuntime Command
0.344 parallel_test 1M
0.362 parallel_test 10M
0.640 parallel_test 100M
9.818 parallel_test 1000M
23.888 parallel_test 2000M
30.217 parallel_test 2500M
30.963 parallel_test 2750M
34.648 parallel_test 3000M
43.302 parallel_test 4000M
55.167 parallel_test 5000M
67.493 parallel_test 6000M
178.654 parallel_test 7000M
204.138 parallel_test 8000M
230.052 parallel_test 9000M
255.639 parallel_test 10000M
757.981 parallel_test 30000M
0.537 gargs_test 1M
0.292 gargs_test 10M
0.398 gargs_test 100M
3.456 gargs_test 1000M
8.577 gargs_test 2000M
22.705 gargs_test 2500M
123.076 gargs_test 2750M
89.866 gargs_test 3000M
291.798 gargs_test 4000M
GNU B<parallel> is pretty much limited by the speed of the disk: Up to
6 GB data is written to disk but cached, so reading is fast. Above 6
GB data are both written and read from disk. When the 30000MB job is
running, the system is slow, but not completely unusable: if you are
not using the disk, you hardly notice it.
B<gargs> hits a wall around 2500M. Then the system starts swapping
like crazy and is completely unusable. At 5000M it goes out of memory.
You can make GNU B<parallel> behave similarly to B<gargs> if you point
$TMPDIR to a tmpfs-filesystem: It will be faster for small outputs,
but kill your system for larger outputs.
=head2 Disk full
GNU B<parallel> buffers on disk. If the disk is full data may be
GNU B<parallel> buffers on disk. If the disk is full, data may be
lost. To check if the disk is full GNU B<parallel> writes an 8193 byte
file every second. If this file is written successfully, it is removed
immediately. If it is not written successfully, the disk is full. The
@@ -758,7 +811,7 @@ systems, whereas 8193 did the correct thing on all tested filesystems.
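The disk-full probe described above can be sketched in shell. The file name below is hypothetical, not the one GNU B<parallel> actually uses:

```shell
# Write an 8193-byte probe file; success means the disk still has room.
# (8193 bytes rather than 8192, per the filesystem quirk noted above.)
probe=${TMPDIR:-/tmp}/diskfull_probe.$$
if head -c 8193 /dev/zero > "$probe" 2>/dev/null; then
    rm -f "$probe"          # remove the probe immediately, as described
    echo "disk has room"
else
    echo "disk full"
fi
```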
The shorthands for replacement strings make a command look more
cryptic. Different users will need different replacement
strings. Instead of inventing more shorthands you get more more
strings. Instead of inventing more shorthands you get more
flexible replacement strings if they can be programmed by the user.
The language Perl was chosen because GNU B<parallel> is written in
@@ -939,7 +992,7 @@ was obsoleted 20130222 and removed one year later.
Until 20150122 variables and functions were transferred by looking at
$SHELL to see whether the shell was a B<*csh> shell. If so the
variables would be set using B<setenv>. Otherwise they would be set
using B<=>. The caused the content of the variable to be repeated:
using B<=>. This caused the content of the variable to be repeated:
echo $SHELL | grep "/t\{0,1\}csh" > /dev/null && setenv VAR foo ||
export VAR=foo


@@ -1281,7 +1281,7 @@ B<--resume-failed> reads the commands from the command line (and
ignores the commands in the joblog), B<--retry-failed> ignores the
command line and reruns the commands mentioned in the joblog.
parallel --resume-failed --joblog /tmp/log
parallel --retry-failed --joblog /tmp/log
cat /tmp/log
Output: