Re: Async processes started in functions not reliably started
From: Robert Elz
Subject: Re: Async processes started in functions not reliably started
Date: Tue, 06 Aug 2019 05:49:36 +0700
Date: Mon, 05 Aug 2019 14:05:43 +0200
From: Steffen Nurpmeso <steffen@sdaoden.eu>
Message-ID: <20190805120543.Bf9-U%steffen@sdaoden.eu>
| Would be nice to have some shell support for signalling the parent
| that the child is now functional,
The shell cannot really know - your example was not functional until
after it set up the traps.
But the shell code knows. Something like the following might work
(untested, not even given off to bash to check syntax; also note that
$? has to be saved as soon as it is set, since real code replacing the
comments would probably overwrite it before it is used again).
In the parent:
        OK=false
        T=$(trap -p USR2)       # only needed if USR2 might be trapped already
        trap 'OK=true' USR2

        run_the_child &

        W=0                     # saved wait status ($? is overwritten by "elif $OK")
        if ! $OK && { wait $!; W=$?; [ "$W" -eq 0 ]; }
        then
                echo "Child failed to initialise properly!" >&2
                # and whatever else you want to do
        elif $OK
        then
                : # here the child is running, and ready
        else
                echo "Failure: $W from child" >&2
                # either the child did exit N (N != 0) in which
                # case $W will tell us why it failed, or some
                # stray signal was delivered (and caught) by the
                # current shell ... deal with those possibilities
        fi

        case "$T" in            # if T= was needed above
        '')     trap - USR2;;   # bash would have said nothing if trap was default
        *)      eval "$T" ;;    # for other shells which do, or if USR2 was trapped.
        esac

        # continue with parent code, now knowing that child has init'd itself
In the child:
        trap 'whatever' SIG_I_NEED
        # any other init that is needed
        kill -s USR2 $$         # or if the parent pid is not $$, use whatever is.
        # do whatever the child is supposed to do
The wait is there to pause the parent. An exit 0 from it should not
happen: it would indicate that the child did exit 0, which it is not
supposed to do at this point. The ! $OK test before the wait is in case
the child started very quickly and the signal has already arrived. There
is still a race condition here: if the signal arrives after the ! $OK
test but before the wait begins, the wait is not interrupted by it and
the parent blocks until the child exits (having the child sleep for a
brief interval as part of its init would help reduce the probability of
problems from that).
Pity the shell has no way to allow scripts to block signals (ie: sigblock).
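To make the handshake above concrete, here is a minimal self-contained
sketch of it. It is an illustration only, not the original poster's code:
run_the_child, its trap on TERM, the two sleeps and the result variable
are all hypothetical stand-ins, and it assumes the parent's pid really is
$$. The sleep before the kill is the "brief interval" mentioned above.

```shell
# Sketch only: run_the_child and its sleeps are hypothetical stand-ins.
run_the_child() {
    trap 'exit 0' TERM      # stands in for the traps the child needs
    sleep 1                 # brief pause, so the parent is very likely in wait
    kill -s USR2 "$$"       # $$ is still the parent's pid inside this subshell
    sleep 1                 # placeholder for the child's real work
}

OK=false
trap 'OK=true' USR2

run_the_child &
child=$!

status=0
if ! $OK
then
    # wait returns a status > 128 when the USR2 trap interrupts it;
    # save it at once, the next command would overwrite $?
    wait "$child" || status=$?
fi

if $OK
then
    result=ready                    # handshake received
elif [ "$status" -eq 0 ]
then
    result=exited-early             # child did exit 0 before signalling
else
    result="failed:$status"         # child did exit N, or a stray signal
fi

trap - USR2
echo "child $child: $result"
wait "$child" || :                  # collect the child once its work is done
```

Saving the wait status in a variable straight away is what keeps it
usable in the later branches.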
If the wait is interrupted by a signal (or if the USR2 signal happened
earlier and we skip the wait) and it was USR2 (from the child), then OK
will become true, and the child is ready to continue. If the wait
returns for some other reason, then either the child did exit N,
indicating that some failure happened before it init'd itself, or perhaps
some other signal was delivered and caught (and did not exit the shell).
If the latter is possible, the wait should be in a loop
(ie: while :; do if wait ...) and that case should cause the loop to
iterate, whereas all the other possibilities end in break.
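A rough sketch of that loop, under some assumptions of mine:
await_child_ready, noisy_child, STRAY and the use of USR1 as the "other"
signal are all hypothetical names chosen for the demonstration, and
kill -0 is used here as one possible way to tell "the wait was
interrupted" apart from "the child is gone".

```shell
# Hypothetical helper: returns 0 once the USR2 handshake has arrived,
# 1 if the child went away first (its wait status is left in $status).
# Assumes OK=false and trap 'OK=true' USR2 were set before the child started.
await_child_ready() {
    pid=$1
    while :
    do
        $OK && return 0                 # handshake already received
        status=0
        wait "$pid" || status=$?
        $OK && return 0                 # the wait was interrupted by USR2
        [ "$status" -le 128 ] && return 1       # child did exit N
        # status > 128: a stray trapped signal, or the child was killed
        kill -0 "$pid" 2>/dev/null || return 1  # child no longer exists
        # child still alive: iterate and wait again
    done
}

OK=false
STRAY=0
trap 'OK=true' USR2
trap 'STRAY=$((STRAY + 1))' USR1        # caught, but does not exit the shell

noisy_child() {
    sleep 1
    kill -s USR1 "$$"       # stray signal, must not count as the handshake
    sleep 1
    kill -s USR2 "$$"       # the real handshake
    # the child's real work would go here
}

noisy_child &
pid=$!

if await_child_ready "$pid"
then
    result=ready
else
    result="failed:$status"
fi

trap - USR1 USR2
wait "$pid" || :                        # collect the child
echo "stray signals: $STRAY, result: $result"
```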
No temp files, named pipes, or other similar stateful mechanisms needed.
What's more, aside from the "trap -p" which is probably not going to be
needed (the script writer knows no other USR2 trap is already set) all of
this is POSIX code (even the trap -p will be in the next version).
kre
ps: the function in the example is badly named. To "reap" is to harvest
or collect; what the function given that name is actually doing is
killing other processes (the original parent collects them, not that
child). A better name would be assassin than reaper (it isn't even the
"Grim Reaper").