From: Jan-Henrik Haukeland
Subject: Re: monit restart bug/race condition (3.1 and current CVS version behavior)
Date: 10 Feb 2003 11:38:22 +0100
User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Civil Service)

Martin Pala <address@hidden> writes:

[clever stuff]
> I tried it on a simple test - in this case monit without the action
> mutex is less affected than monit with the action mutex in
> check_process(), which fails every time. To solve the race condition,
> it will probably be necessary to implement a per-process mutex before
> the do_validate flag test and the cervlet check_process() calls. I did
> that too, but I found a deadlock in that case => it needs more work
> (last 'development' patch attached - there is probably a better way
> to do it).

This looks okay! The reason it does not work is that you need to
initialize the mutex first :-) When the process is created in
p.y:createprocess(), call pthread_mutex_init(&current->mutex, NULL);
and likewise, in gc.c:gc_process(), call
pthread_mutex_destroy(&(*p)->mutex);
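
A minimal sketch of that init/destroy pairing, in case it helps. The
struct and the demo_* functions are hypothetical stand-ins, not monit's
actual Process_T, createprocess() or gc_process() code:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for monit's process object. */
typedef struct demo_process {
  char *name;
  int do_validate;
  pthread_mutex_t mutex;   /* per-process mutex */
} *DemoProcess_T;

/* Create a process object; the mutex is initialized before first use. */
DemoProcess_T demo_create_process(const char *name) {
  DemoProcess_T p;

  assert(name);

  p = calloc(1, sizeof *p);
  if (!p)
    return NULL;
  p->name = malloc(strlen(name) + 1);
  if (!p->name) {
    free(p);
    return NULL;
  }
  strcpy(p->name, name);
  p->do_validate = 1;
  if (pthread_mutex_init(&p->mutex, NULL) != 0) {
    free(p->name);
    free(p);
    return NULL;
  }
  return p;
}

/* Free a process object; destroy the mutex before releasing the memory. */
void demo_gc_process(DemoProcess_T *p) {
  if (!p || !*p)
    return;
  pthread_mutex_destroy(&(*p)->mutex);
  free((*p)->name);
  free(*p);
  *p = NULL;
}
```

Locking a mutex that was never initialized is undefined behavior, which
could explain the deadlock you saw.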

The validate.c:do_not_validate() and util.c:is_process_running()
functions should (probably) use the same mutex, but the main reason to
use a mutex is so that a process's start/stop methods are synchronized
(i.e. serialized) per process, to avoid stuff like this:

        (restart)
        thread1:                thread2

        process.stop            
                                is(!process.running) -> it's not running
                                 process.start
        process.start

With a mutex:

        (restart)
        thread1:                thread2

        LOCK
         process.stop            
         process.start
        END_LOCK
                                LOCK
                                 is(!process.running) -> it's running
                                  process.start          (skipped)
                                END_LOCK
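
The serialized case above could be sketched roughly like this. All names
(demo_proc, demo_restart_thread, and so on) are hypothetical and only
model the idea; they are not monit's real start/stop code:

```c
#include <pthread.h>

/* Hypothetical process state; not monit's real data structure. */
struct demo_proc {
  pthread_mutex_t mutex;
  int running;       /* 1 while the process is up */
  int start_calls;   /* how many times start was executed */
};

static void demo_stop(struct demo_proc *p)  { p->running = 0; }
static void demo_start(struct demo_proc *p) { p->running = 1; p->start_calls++; }

/* thread1: restart = stop + start as one critical section */
static void *demo_restart_thread(void *arg) {
  struct demo_proc *p = arg;
  pthread_mutex_lock(&p->mutex);
  demo_stop(p);
  demo_start(p);
  pthread_mutex_unlock(&p->mutex);
  return NULL;
}

/* thread2: validation = check + conditional start under the same mutex */
static void *demo_validate_thread(void *arg) {
  struct demo_proc *p = arg;
  pthread_mutex_lock(&p->mutex);
  if (!p->running)
    demo_start(p);
  pthread_mutex_unlock(&p->mutex);
  return NULL;
}

/* Run both threads against a running process; return the number of starts. */
int demo_run_restart(void) {
  struct demo_proc p = { .running = 1, .start_calls = 0 };
  pthread_t t1, t2;

  pthread_mutex_init(&p.mutex, NULL);
  pthread_create(&t1, NULL, demo_restart_thread, &p);
  pthread_create(&t2, NULL, demo_validate_thread, &p);
  pthread_join(t1, NULL);
  pthread_join(t2, NULL);
  pthread_mutex_destroy(&p.mutex);
  return p.start_calls;
}
```

Whichever thread gets the mutex first, the validating thread can never
see the process in its stopped state, so start runs exactly once.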



static int do_not_validate(Process_T p) {

  int rv= TRUE;

  ASSERT(p);

  /* Hold the per-process mutex while the process state is tested */
  LOCK(p->mutex)
        rv= (!p->do_validate  ||
             check_skip(p)    ||
             check_timeout(p) ||
             check_checksum(p));
  END_LOCK;

  return rv;

}
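
For readers without the source at hand: the LOCK/END_LOCK pair used
above is assumed to be a macro pair wrapping pthread_mutex_lock() and
pthread_mutex_unlock(). One way such macros could be written is
sketched below; the DEMO_ names and the demo_flags struct are
hypothetical and not monit's own definitions:

```c
#include <pthread.h>

/* Hypothetical lock macros: open a block, lock; unlock, close the block. */
#define DEMO_LOCK(mutex) do { pthread_mutex_t *demo_m_ = &(mutex); \
                              pthread_mutex_lock(demo_m_);
#define DEMO_END_LOCK         pthread_mutex_unlock(demo_m_); } while (0)

/* Hypothetical, reduced process object with two of the tested flags. */
struct demo_flags {
  pthread_mutex_t mutex;
  int do_validate;
  int skip;
};

/* Returns 1 if validation should be skipped, 0 otherwise. */
int demo_do_not_validate(struct demo_flags *p) {
  int rv = 1;

  DEMO_LOCK(p->mutex)
    rv = (!p->do_validate || p->skip);
  DEMO_END_LOCK;

  return rv;
}
```

With macros of this shape, a return or break between the two macros
would leave the mutex locked, which is why the result is stored in rv
and returned after END_LOCK.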

> I must go to work tomorrow morning, so i will continue either
> tomorrow evening (or you can try it :)

I'm knee-deep in my damn zervlet project :-( It's about to be
released, so if you could give it a stab, that would be appreciated!

> Btw, what about releasing 3.2 after the race condition fix? There
> were a few bugs in 3.1, so we could probably make a bugfix release
> 3.2 (plus a few extensions which are already made for it) and
> schedule device tests for 3.3, which I hope will be ready soon :)

Good idea, I'll second that. +1

-- 
Jan-Henrik Haukeland



