parallel

Re: Limiting memory used by parallel?


From: Ole Tange
Subject: Re: Limiting memory used by parallel?
Date: Tue, 30 Jan 2018 17:52:47 +0100

On Mon, Jan 29, 2018 at 6:19 PM, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
> On Sun, Jan 28, 2018 at 02:45:42AM +0100, Ole Tange wrote:
>> On Thu, Jan 25, 2018 at 4:33 PM, hubert depesz lubaczewski
>> You can also use --cat:
>>
>>   tar cf - /some/directory | parallel -j 5 --pipe --block 5G --cat
>> --recend '' 'cat {} | ./handle-single-part.sh {#}'
>>
>> This way each block is saved to the tempdir before the job starts. By
>> my limited testing this should make GNU Parallel only keep 1-2 blocks
>> in memory.
>
> So, I did try it.
> To make it as simple as possible, I made source of data:
> dd if=/dev/zero bs=8k count=13107200

Are you sure your tar command can sustain data delivery at that
speed? If not, then you are not running a realistic test.
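For scale, the dd source in the quoted message emits a fixed, easily computed volume; a quick check (the numbers come straight from the quoted command):

```shell
# dd if=/dev/zero bs=8k count=13107200 writes 13,107,200 blocks of 8 KiB:
echo $((8192 * 13107200))                 # total bytes: 107374182400
echo $((8192 * 13107200 / 1073741824))    # = 100 GiB
```

100 GiB read from /dev/zero will typically arrive far faster than any real tar stream, which is exactly why the test is not representative.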

The command above relies on tar _not_ delivering data faster than the
temp file can be written to the temp disk.

Typically the tmp filesystem is at least as fast as any other
filesystem, and on many systems /tmp is in fact faster than the other
filesystems on the server.

If you insist on not using tar to generate the input, then at least
make sure 'dd' delivers data only at the speed that 'tar' would
(e.g. by using 'pv').
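A sketch of that pv approach, assuming the 'pv' utility is installed and reusing the hypothetical handle-single-part.sh from the earlier message; the 200 MB/s figure is an arbitrary stand-in for the throughput your real tar achieves:

```shell
# Hypothetical: cap dd at ~200 MB/s ('pv -L' rate-limits the stream) so
# parallel is fed no faster than tar would feed it.
#
#   dd if=/dev/zero bs=8k count=13107200 2>/dev/null |
#     pv -qL 200M |
#     parallel -j 5 --pipe --block 5G --cat --recend '' \
#       'cat {} | ./handle-single-part.sh {#}'
#
# At roughly 200 MB/s, each ~5 GB block takes about:
echo $((5 * 1000 / 200))   # ~25 seconds to arrive
```

With blocks arriving that slowly, GNU Parallel should only ever need to hold 1-2 blocks at a time, matching the behavior described earlier in the thread.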


/Ole


