On Wed, Feb 26, 2014 at 8:07 PM, Rob Sargent <robjsargent@gmail.com> wrote:
I'm running GNU parallel 20131222 with --results outdir --jobs 4; all 4
processors are running at capacity, but I only get the stdout from two of
the jobs. It looks as if the other two are buffering their output, since
they are gaining memory while the first two jobs stay at constant memory.
The stdout for each will be ~2 GB.
Is this expected?
GNU Parallel itself should not increase memory usage: its buffering is
done on disk. What you are seeing is two jobs that choose to buffer
their own output in memory, and GNU Parallel cannot force them to flush
that buffer. As part of the release procedure every version of GNU
Parallel is tested with output > 4 GB, so GNU Parallel does not have a
2 GB limit.
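If the jobs use stdio, you may be able to change their buffering from
outside with stdbuf from GNU coreutils, so their output reaches GNU
Parallel's on-disk buffer continuously. A sketch, where "myprog" is a
hypothetical stand-in for the real job:

```shell
# Ask stdio-based programs to line-buffer stdout instead of fully
# buffering it in memory ("myprog" and its args are placeholders):
#   parallel --results outdir --jobs 4 stdbuf -oL myprog ::: args
# Demonstration of the stdbuf wrapper itself, with seq as the job:
stdbuf -oL seq 3
```

Note this only works for programs that buffer via stdio; a job that
manages its own in-memory buffer is unaffected.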
Can you reproduce the problem?
· A complete example that others can run that shows the problem.
  This should preferably be small and simple. A combination of
  yes, seq, cat, echo, and sleep can reproduce most errors. If
  your example requires large files, see if you can make them by
  something like seq 1000000 > file or yes | head -n 10000000 >
  file. If your example requires remote execution, see if you can
  use localhost - maybe using another login.
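For this report, a reproducer along those lines might look like the
sketch below; the input size and the cat job are illustrative
assumptions, not the reporter's actual workload:

```shell
# Build a large, reproducible input file with coreutils only, as the
# checklist suggests (1000000 lines is the guide's example size):
seq 1000000 > file
wc -l < file
# The reproducer would then run something like:
#   parallel --results outdir --jobs 4 cat ::: file file file file
# and one would check whether stdout for some jobs lags while they
# gain memory.
```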
If you suspect the error is dependent on your environment
or distribution, please see if you can reproduce the error
on one of these VirtualBox images:
http://sourceforge.net/projects/virtualboximage/files/
/Ole