parallel



From: Ole Tange
Subject: Re: Your opinion needed: Should GNU Parallel ignore, kill or wait for background children?
Date: Tue, 24 May 2016 22:06:37 +0200

On Tue, May 24, 2016 at 3:55 AM, Martin d'Anjou
<martin.danjou14@gmail.com> wrote:

> On 16-05-23 06:46 PM, Ole Tange wrote:
>>
>> Should GNU Parallel ignore, kill or wait for background children?
>>
>> Example:
>>
>> $ parallel '(sleep 100) & echo' ::: 1
:
> Do not leak processes. In the simple case, it is as bad as leaking memory,
> and worse, the leaked process may monopolize external resources like files,
> or a database, etc.

The leak will only happen if the user starts a detached background job
as above. There will be no leak if you start a simple process or if
you wait for your own children:

$ parallel '(sleep 100) & echo {}; wait' ::: no leak

So the question more or less boils down to: Should GNU Parallel do the
wait or should we assume that if the user wanted the wait, he would
put it in himself (which is how it works now)?

> I expect GNU Parallel to wait until everything it has started also has
> completed,

GNU Parallel will wait for everything it started directly. The
question here is whether it should also wait for something the user
started inside that.

GNU Parallel can test whether the process group is still running (by
running kill -0 on the pgrp). This also works if the process group
starts a detached job in the background. But if the detached job
creates a new process group, it will be out of GNU Parallel's reach.
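
The kill -0 test mentioned above can be sketched in bash roughly as follows. This is only an illustration, not code from GNU Parallel: the helper name pgrp_alive is made up, and the demo assumes bash with job control switched on (set -m), so that the background job gets its own process group whose ID equals the PID in $!.

```shell
#!/usr/bin/env bash
# pgrp_alive is a hypothetical helper name, not a function from GNU Parallel.
# Signal 0 delivers nothing; kill only reports whether the target exists.
# A negative PID addresses the whole process group.
pgrp_alive() {
    kill -0 -- "-$1" 2>/dev/null
}

set -m                  # job control: each background job gets its own process group
sleep 1 &
pgid=$!                 # for a single-command job the group ID equals this PID
pgrp_alive "$pgid" && echo "group $pgid is running"
wait "$pgid"
pgrp_alive "$pgid" || echo "group $pgid is gone"
```

A process that calls setpgrp leaves the group, so a check like this no longer sees it.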

Compare the following example with the one after it:

$ parallel  "(perl -e 'sleep(100)')& echo \$\$" ::: no_new_pgrp
941274 no_new_pgrp
$ ps -o pid,pgrp,cmd
    PID    PGRP CMD
 940626  940626 /bin/bash
 941111  941111 /usr/bin/emacs -nw Makefile
 941275  941274 perl -e sleep(100)
 941276  941276 ps -o pid,pgrp,cmd

GNU Parallel knows of PGRP 941274, so all is OK, and GNU Parallel can
wait for PGRP to finish or even kill it (which it already does if
--timeout is exceeded). Currently GNU Parallel does not wait for the
PGRP before considering the job finished, but that can probably be
changed.

$ parallel  "(perl -e 'setpgrp;sleep(100)')& echo \$\$" ::: new_pgrp
941286 new_pgrp
$ ps -o pid,pgrp,cmd
    PID    PGRP CMD
 940626  940626 /bin/bash
 941111  941111 /usr/bin/emacs -nw Makefile
 941287  941287 perl -e setpgrp;sleep(100)
 941288  941288 ps -o pid,pgrp,cmd

GNU Parallel knows of PGRP 941286 but does not know that 941287 was
spawned from 941286, so GNU Parallel will never be able to wait for
941287.

> and I also expect GNU Parallel to return a non-zero exit code if
> one of the processes it launched returned a non-zero.

What does this return? True? False? Frue? Talse?

  ( false )  & true

GNU Parallel will see this as exit code 0. And I see no way of getting
hold of the exit code of the command in the ().
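
The exit code behaviour can be checked in a plain shell. The sketch below is not something GNU Parallel does; it only shows that the status of the list is that of the last foreground command, and that the user's own command line could recover the background status by saving $! and calling wait on it. The variable names status_nowait and status_wait are made up for the illustration.

```shell
#!/bin/sh
# Without a wait, the failing background subshell is invisible:
# the list's status is that of the final foreground command (true).
status_nowait=0
sh -c '( false ) & true' || status_nowait=$?
echo "without wait: $status_nowait"

# If the user saves the PID and waits for it, wait returns the
# exit status of the background subshell instead.
status_wait=0
sh -c '( false ) & pid=$!; true; wait "$pid"' || status_wait=$?
echo "with wait: $status_wait"
```

This prints 0 for the first case and 1 for the second, so the information is only available if the user's command asks for it.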

> If you don't want to wait for GNU Parallel, then background GNU Parallel
> itself, and hold on to that PID if you want, but do not let children leak to
> the OS by default.
>
> GNU Make, which has a parallel feature, does not leak processes. GNU
> Parallel should not leak either.

That, unfortunately, is not true. Try this Makefile:

all:
        bash -c "(sleep 100) &"

make -j will finish immediately - leaking the sleep process. This is
again because the user did not wait for the child process:

all:
        bash -c "(sleep 100) & wait"

which has no leak.

- o -

The more I discuss this with you the more I am starting to be
convinced that it is the user's job to do the wait - not GNU Parallel.
There are a few practical reasons:

* If the user does not want GNU Parallel to wait there will have to be
an option to disable this (which is likely to be used so rarely that
no one will remember it).
* It requires changing the code to support the waiting, whereas if
it is the user's responsibility then the code is fine as it is. It
even cleans up nicely by killing the children if the process is killed
by --timeout.


/Ole


