Re: Revision of GNU Parallel's processing of SIGTERM


From: Martin d'Anjou
Subject: Re: Revision of GNU Parallel's processing of SIGTERM
Date: Sun, 12 Apr 2015 19:53:03 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 15-04-12 07:14 AM, Ole Tange wrote:
On Sat, Apr 11, 2015 at 12:56 AM, Martin d'Anjou
<martin.danjou14@gmail.com> wrote:
Hello Ole,

I worked on the SIGTERM propagation feature today. I have some questions;
they also appear in the code as comments, if you prefer to read them
there (search for "Question"):
https://github.com/martinda/gnu-parallel/compare/sigterm-1?expand=1#diff-5379ba718ef5b0a2feb45981e768a9fd

Q1:
Inside sub wait_and_exit, $job->kill("TERM") is called twice. As I am trying
to update the documentation, I find this complex to explain.
Do you know why the call is made twice?
Should I write my own "wait_and_exit" for the SIGTERM propagation feature?
I think it is a leftover from when $job->kill() did not send 2 TERMs.

The idea is to handle programs that, like GNU Parallel itself, need 2 TERMs
to exit, when they are started from GNU Parallel.

I understand now. Very clear. Another special program is emacs: I have read that SIGINT does not kill it! I have one other program like this, unfortunately a third-party binary.

Q2:
I have added a [--wait-for-children [GRACE_PERIOD]] option for the user to
extend the grace period of $sleepsum in case the user is dealing with
processes that take a long time to "put to rest".
My question: should this option be available in general, or just for the
propagation feature?
Do we really need an option for this? I would like to see at least 2
real life scenarios, where this makes sense and for which a hard coded
value will not work.

I really do not like the current --wait-for-children solution that I proposed. After much thought, I find it a bit too specific, and it does not fit well.

I have prepared the documentation for a different approach. I will send another email to keep things separate. This discussion is getting to be a lot of text.

In terms of a real life scenario, I can offer an overview of my workflow.

Some processes take a long time to terminate from GNU Parallel's point of view: more than 200 ms can pass between the moment GNU Parallel issues the TERM signal and the moment it hears back from the processes. For example, the current chain of command for SIGTERM in my workflow is: Jenkins, script, script, GNU Make, GNU Parallel, script, grid engine submission host, grid engine master, grid engine execution host, script, program. The last program is CPU/RAM/IO intensive; the layers above it are for build management.

When users hit the "kill the running job" button, SIGTERM has to make its way down to the low-level program, which does some work to terminate properly (this could take a few seconds), and then the termination travels back up the chain. At each level, a little processing is needed to close that level properly, and each level works better when its child process terminates in an orderly fashion. The delay between sending SIGTERM and hearing back from the child-most process can therefore exceed 200 ms.

I hope this demonstrates that in some cases, extending the grace period beyond 200ms benefits the user.


Q3:
Still in the wait_and_exit subroutine, the grace period is "ANDed" with the
family_pids[0].
Why just the 0'th element? Why not the entire array?
You mean in sub Job::kill():

             # Wait up to 200 ms between TERMs - but only if any pids are alive
             my $sleep = 1;
             for (my $sleepsum = 0; kill 0, $family_pids[0] and $sleepsum < 200;
                  $sleepsum += $sleep) {
                 $sleep = ::reap_usleep($sleep);
             }

'kill 0, pid' returns true if the process is still running.
$family_pids[0] is the immediate child (i.e. the parent of any
(grand)*children).
There is no need to see if any (grand)*children are running: it is the
job of $family_pids[0] to kill those.

Ok, I understand now. Yes this makes sense. I agree.

The for loop runs up to 200 ms, but if the pid dies earlier, then the
loop exits.
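
As a rough illustration of that "wait up to a limit, but stop early if the pid dies" loop, here is a minimal sketch in Python (not GNU Parallel's Perl). The name wait_up_to and the back-off values are assumptions made for this example; they are not copied from ::reap_usleep. It polls the child handle instead of sending signal 0, because an exited child that has not been reaped yet still answers signal 0.

```python
import subprocess
import time


def wait_up_to(proc: subprocess.Popen, limit_ms: float = 200.0) -> bool:
    """Poll proc until it exits or limit_ms elapses.

    Returns True if the process exited within the limit, False otherwise.
    (Hypothetical helper, not part of GNU Parallel.)
    """
    deadline = time.monotonic() + limit_ms / 1000.0
    sleep_s = 0.001  # start with a short nap and back off gradually
    while proc.poll() is None:  # poll() also reaps the child once it has exited
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        time.sleep(min(sleep_s, remaining))  # never sleep past the deadline
        sleep_s = min(sleep_s * 1.5, 0.05)
    return proc.poll() is not None
```

The limit is only an upper bound: a child that exits after 5 ms makes the function return after roughly 5 ms, not after the full 200 ms.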

But maybe this should be revised:

When a job times out (--timeout) we want to kill it. It is OK to give
it 200 - 1000 ms to clean up, so 'kill TERM', wait, 'kill TERM', wait,
'kill KILL'.
When GNU Parallel receives 2 TERMs, it should for all jobs 'kill
TERM', wait, 'kill TERM', wait, 'kill KILL'.
The wait should always be an upper limit: Do not wait a full second,
if the job finishes faster.
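
The 'kill TERM', wait, 'kill TERM', wait, 'kill KILL' sequence could look roughly like the following. This is a sketch in Python under stated assumptions, not the GNU Parallel implementation: the function names and the 0.2 s default grace period are chosen only for illustration, and each wait is an upper limit that ends as soon as the job exits.

```python
import signal
import subprocess
import time


def wait_up_to(proc: subprocess.Popen, limit_s: float) -> bool:
    """Return True as soon as proc has exited, False once limit_s has elapsed."""
    deadline = time.monotonic() + limit_s
    while proc.poll() is None and time.monotonic() < deadline:
        time.sleep(0.005)
    return proc.poll() is not None


def terminate_with_escalation(proc: subprocess.Popen, grace_s: float = 0.2) -> list:
    """Send TERM, wait, TERM, wait, KILL; stop as soon as the job is gone.

    Returns the list of signals that were actually sent.
    (Hypothetical helper, not part of GNU Parallel.)
    """
    sent = []
    for sig in (signal.SIGTERM, signal.SIGTERM, signal.SIGKILL):
        if proc.poll() is not None:
            break  # job already exited: no further signals needed
        proc.send_signal(sig)
        sent.append(sig)
        if sig != signal.SIGKILL:
            wait_up_to(proc, grace_s)  # upper limit only, returns early on exit
    proc.wait()  # reap the job so it does not linger as a zombie
    return sent
```

A job that exits on the first TERM receives exactly one signal; a job that ignores TERM receives TERM, TERM, KILL after at most two grace periods.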

I am not sure whether GNU Parallel should also kill the
(grand*)children, and if so how that should be done to work well for
most cases. Maybe:

'kill TERM', wait, 'kill TERM', wait, 'kill KILL', 'kill KILL
@grandchildren_pid'

This way the parent is given a chance to cleanup, but if it did not
manage, then GNU Parallel does the cleaning. It would be good to have
testcases for this kind of scenario.
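
One possible shape for "give the parent a chance, then clean up whatever is left", again sketched in Python and Unix only. Note the swapped-in technique: instead of keeping a list of (grand)*children pids as proposed above, this sketch starts the job in its own session and sends the final KILL to its process group. That is an assumption made for brevity; descendants that move to another process group would escape it. All function names are hypothetical.

```python
import os
import signal
import subprocess
import time


def start_job(cmd, **popen_kwargs) -> subprocess.Popen:
    """Start cmd as the leader of a new session and process group."""
    return subprocess.Popen(cmd, start_new_session=True, **popen_kwargs)


def wait_up_to(proc: subprocess.Popen, limit_s: float) -> bool:
    """Return True as soon as proc has exited, False once limit_s has elapsed."""
    deadline = time.monotonic() + limit_s
    while proc.poll() is None and time.monotonic() < deadline:
        time.sleep(0.005)
    return proc.poll() is not None


def kill_job_and_descendants(proc: subprocess.Popen, grace_s: float = 0.2) -> list:
    """TERM, wait, TERM, wait, KILL the job, then KILL the rest of its group.

    Assumes proc was created by start_job(), so that its pid is also its
    process group id. Returns the signals sent to the job itself.
    """
    pgid = proc.pid
    sent = []
    for sig in (signal.SIGTERM, signal.SIGTERM, signal.SIGKILL):
        if proc.poll() is not None:
            break  # the job exited; it had its chance to clean up
        proc.send_signal(sig)
        sent.append(sig)
        if sig != signal.SIGKILL:
            wait_up_to(proc, grace_s)
    proc.wait()
    try:
        os.killpg(pgid, signal.SIGKILL)  # sweep descendants the job left behind
    except ProcessLookupError:
        pass  # nothing left in the group: the job cleaned up by itself
    return sent
```

The job is signalled first and its group is swept only afterwards, so a well-behaved parent still gets to shut down its own children in an orderly fashion.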

The new tests I wrote are very close to this. They are on github for now:
https://github.com/martinda/gnu-parallel/blob/sigterm-1/testsuite/tests-to-run/parallel-local-signals.sh

I should be able to write one for this scenario if needed.

Thank you very much for your explanations, it helps a lot.

Martin



