bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-gnulib] addition: wait-process.h, wait-process.c, 2nd round


From: Bruno Haible
Subject: Re: [Bug-gnulib] addition: wait-process.h, wait-process.c, 2nd round
Date: Thu, 16 Oct 2003 16:05:39 +0200
User-agent: KMail/1.5

Paul Eggert wrote:
> > typedef struct
> > {
> >   volatile sig_atomic_t used;
> >   volatile pid_t child;
> > }
> > slaves_entry_t;
>
> "slaves_entry_t" violates the namespace rules for applications, which
> aren't allowed to define anything ending in "_t" if they include the
> standard include files.  Perhaps it should just be called "slaves_entry".

My opinion on this is that
  - POSIX will never define a slaves_entry_t.
  - Legibility of the source is better if one can distinguish
    variables and types by a single glance, without learning the set of
    types by heart.
  - Automatic source code colouring in Emacs works better when types
    have a _t suffix.
and therefore proactively avoiding type names ending in _t does more
harm than good.

> I'm a bit dubious about the semantics, portability, and
> understandability of structs whose members are volatile.

It's the same as with const: If either the variable, or the struct typedef,
or the field inside the struct carry the const/volatile qualifier, then
const/volatile is in effect.

> I'd be more comfortable if the typedef didn't mention 'volatile'; instead,
> you can have the corresponding vars be 'volatile'.

I tried this earlier. It forced me to cast the pointers before calling
memcpy or free - because free() expects an unqualified 'void *' -, leading
to gcc warnings when some extra -W options are in effect.

> > static size_t slaves_allocated = SIZEOF (static_slaves);
>
> Hmm, why doesn't this need to be volatile?

Because it is not accessed from within the signal handler. So there is
no race condition about it.

> >       slaves_entry_t *new_slaves =
> >     malloc (new_slaves_allocated * sizeof (slaves_entry_t));
>
> That multiplication can overflow, which would cause serious memory buffer
> overruns later on.  You need to add something like this:
>
>     if (SIZE_MAX / sizeof (slaves_entry_t) < new_slaves_allocated)
>       xalloc_die ();

Can you put the array_size_overflow macro or inline function in a
public header file? xalloc.h or a different one, I don't mind.

> > /* Unregister a child from the list of slave subprocesses.  */
> > static inline void
> > unregister_slave_subprocess (pid_t child)
> > {
> >   /* The easiest way to remove an entry from a list that can be used by
> >      an asynchronous signal handler is just to mark it as unused.  For
> > this, we rely on sig_atomic_t.  */
> >   slaves_entry_t *s = slaves;
> >   slaves_entry_t *s_end = s + slaves_count;
> >
> >   for (; s < s_end; s++)
> >     if (s->used && s->child == child)
> >       s->used = 0;
> > }
>
> Isn't there a race here?  Just before 's->used = 0' is executed, a
> process could have died and another process could have been
> registered.  Or perhaps there are some constraints on when the
> functions can be called (e.g., you can't call these functions from a
> signal handler?).

unregister_slave_subprocess cannot be called from a signal handler,
because its only caller is wait_subprocess and calling wait_subprocess
from a signal handler would be crazy.

However there is a race condition indeed: Assume
  - A process dies while the parent is already listening in wait_subprocess(),
  - then the parent's waitpid() call returns and the subprocess is deleted
    from the list of zombie processes,
  - then - for some reason - the parent is stopped for a very long time,
  - while it is stopped, another unrelated process takes on the same PID,
  - then the parent continues execution and is soon caught by a SIGTERM,
    before it had a chance to complete the unregister_slave_subprocess
    function.
Then the parent's signal handler will kill the wrong process.

I think this scenario is
  - extremely rare, due to the time it requires until a PID is reused,
  - due to an inherent limitation of waitpid in POSIX: there is no way to
    ask waitpid: "Tell me when the process terminates but keep the process
    in zombie state afterwards."

I don't know how to avoid such a scenario.

Bruno





reply via email to

[Prev in Thread] Current Thread [Next in Thread]