From: Jan-Henrik Haukeland
Subject: Re: monit restart bug/race condition (3.1 and current CVS version behavior)
Date: 10 Feb 2003 11:38:22 +0100
User-agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Civil Service)

Martin Pala <address@hidden> writes:
[clever stuff]
> I tried it on a simple test - in this case monit without the action mutex
> is less affected than monit with the action mutex in check_process(),
> which fails every time. To solve the race condition, it will probably be
> necessary to implement a per-process mutex before the do_validate flag
> test and the cervlet check_process() calls. I did that too, but I found a
> deadlock in that case => it needs more work (last 'development' patch
> attached - probably there could be a better way to do it).
This looks okay! The reason it does not work is that you will need to
initialize the mutex first :-) When the process is created in
p.y:createprocess() do a pthread_mutex_init(&current->mutex, NULL);
and likewise in gc.c:gc_process() do a
pthread_mutex_destroy(&(*p)->mutex);
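
A minimal sketch of that init/destroy pairing, assuming a cut-down Process_T. The struct layout and the process_new()/process_free() helpers are hypothetical stand-ins for createprocess() and gc_process(), not the actual monit code:

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for monit's process object. */
typedef struct process {
    int do_validate;
    pthread_mutex_t mutex;   /* per-process mutex */
} *Process_T;

/* Stand-in for createprocess(): the mutex must be initialized
 * before any thread tries to lock it. */
Process_T process_new(void) {
    Process_T p = calloc(1, sizeof *p);
    if (!p)
        return NULL;
    if (pthread_mutex_init(&p->mutex, NULL) != 0) {
        free(p);
        return NULL;
    }
    p->do_validate = 1;
    return p;
}

/* Stand-in for gc_process(): destroy the (unlocked) mutex
 * before releasing the memory that holds it. */
void process_free(Process_T *p) {
    if (!p || !*p)
        return;
    pthread_mutex_destroy(&(*p)->mutex);
    free(*p);
    *p = NULL;
}
```

Locking a pthread mutex that was never initialized is undefined behavior, which could explain a hang that looks like a deadlock.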
And the validate.c:do_not_validate() and util.c:is_process_running()
methods should (probably) use the same mutex, but the main reason to
use a mutex is that process's start/stop methods are synchronized
(i.e. serialized) per process to avoid stuff like this:
(restart)
thread1:            thread2:
process.stop
                    is(!process.running) -> it's not running
process.start
                    process.start
With mutex
(restart)
thread1:            thread2:
LOCK
process.stop
process.start
END_LOCK
                    LOCK
                    is(!process.running) -> it's running
                    process.start
                    END_LOCK
static int do_not_validate(Process_T p) {
  int rv= TRUE;
  ASSERT(p);
  /* Read the validation state under the per-process mutex */
  LOCK(p->mutex)
    rv= (!p->do_validate ||
         check_skip(p) ||
         check_timeout(p) ||
         check_checksum(p));
  END_LOCK;
  return rv;
}
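
The "with mutex" interleaving above can be sketched as a small standalone program. This is only an illustration under assumptions: Proc, proc_stop(), proc_start() and run_demo() are hypothetical names, and plain pthread_mutex_lock()/pthread_mutex_unlock() calls stand in for the LOCK/END_LOCK macros:

```c
#include <pthread.h>

/* Hypothetical toy process: a running flag plus a start counter. */
typedef struct {
    pthread_mutex_t mutex;
    int running;
    int starts;
} Proc;

static void proc_stop(Proc *p)  { p->running = 0; }
static void proc_start(Proc *p) { p->running = 1; p->starts++; }

/* thread1: restart = stop + start as one critical section */
static void *restart_thread(void *arg) {
    Proc *p = arg;
    pthread_mutex_lock(&p->mutex);
    proc_stop(p);
    proc_start(p);
    pthread_mutex_unlock(&p->mutex);
    return NULL;
}

/* thread2: start the process only if it is not running */
static void *validate_thread(void *arg) {
    Proc *p = arg;
    pthread_mutex_lock(&p->mutex);
    if (!p->running)
        proc_start(p);
    pthread_mutex_unlock(&p->mutex);
    return NULL;
}

/* Runs both threads once; returns the number of starts, or -1 on error. */
int run_demo(void) {
    Proc p = { .running = 1, .starts = 0 };
    pthread_t t1, t2;

    if (pthread_mutex_init(&p.mutex, NULL) != 0)
        return -1;
    if (pthread_create(&t1, NULL, restart_thread, &p) != 0) {
        pthread_mutex_destroy(&p.mutex);
        return -1;
    }
    if (pthread_create(&t2, NULL, validate_thread, &p) != 0) {
        pthread_join(t1, NULL);
        pthread_mutex_destroy(&p.mutex);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_mutex_destroy(&p.mutex);
    return p.starts;
}
```

Because the check in thread2 can never observe the state between stop and start, the process is started exactly once whichever thread takes the lock first.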
> I must go to work tomorrow morning, so I will continue
> tomorrow evening (or you can try it :)
I'm knee-deep in my damn zervlet project :-( It's about to get
released.. so if you could give it a stab it's appreciated!
> Btw. what about releasing 3.2 after the race condition fix? There were
> a few bugs in 3.1, so we can probably make a bugfix release 3.2 (plus
> a few extensions which are already made for it) and schedule device
> tests for 3.3, which I hope will be ready soon :)
Good idea, I'll second that. +1
--
Jan-Henrik Haukeland