
Re: Out of memory when doing --pipe --block ?


From: Ole Tange
Subject: Re: Out of memory when doing --pipe --block ?
Date: Sat, 10 Feb 2018 23:57:43 +0100

On Fri, Feb 9, 2018 at 12:56 PM, Hubert Kowalski <h.kowalski@hakger.pl> wrote:

> If I may ask while we're at it: can we make some option to have GNU Parallel
> NOT buffer things and/or not output things? Some operations are heavy on I/O
> and not computationally intensive, so there's no need for either buffering
> or output from them :)

The reason why you cannot get that is described in
http://lists.gnu.org/archive/html/parallel/2018-01/msg00027.html

"""
You can see the reason for this design by imagining jobs that read
very slowly: You will want all 5 of these to be running, but you would
have to read (and buffer) at least 4*5 GB to start the 5th process,
and the code is cleaner if you simply read the full block for every
process.
"""

In other words, you need to describe how this should work:

  program_generating_1GB_per_second |
    parallel -j5 --pipe --block 5G reader_that_reads_1Mbyte_per_second
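To get a feel for what this means in practice, here is a scaled-down
sketch of the same scenario. The numbers are only illustrative and it
assumes 'pv' is installed to rate-limit the pipes; interrupt it with
Ctrl-C when you have seen enough:

  # generator roughly 1000x faster than each reader, as above,
  # but everything scaled down by a factor of 100
  yes | pv -q -L 10M |
    parallel -j5 --pipe --block 50M 'pv -q -L 10k > /dev/null'

Watching how far ahead of the slow readers the input has to be read
makes the problem concrete.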

Explain how you will provide input to the 5 readers. Note that the 1st
process should get the data block 0-5 GB from the generator, the 2nd
should get 5-10 GB, the 3rd 10-15 GB, the 4th 15-20 GB, and the 5th
20-25 GB.

Within 25 seconds the generator can generate 25 GB. A single reader
can read 25 MB, and 5 readers can read 125 MB.
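Spelled out, assuming those rates hold for the whole 25 seconds:

  generated:           25 s * 1 GB/s      = 25 GB
  consumed by readers: 5 * 25 s * 1 MB/s  = 125 MB
  difference:          25 GB - 125 MB     = ~24.9 GB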

Explain what the system looks like after running for 25 seconds. How
much has been read from the generator? How much has been given to the
readers? Where is the difference stored? How do you ensure the 5th
reader gets the data block 20-25 GB?


/Ole


