monit-dev

Re: additional feature for monit-3.0 (for clusters)


From: Martin Pala
Subject: Re: additional feature for monit-3.0 (for clusters)
Date: Mon, 28 Oct 2002 14:14:28 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020913 Debian/1.1-1

Agreed - any suggestions?


Oliver Jehle wrote:

No problem inserting a new config statement...

but it should be an xor with autostart... you cannot have autostart=true and also monitor only manually started services:

so I think giving autostart a better name would be easier!




Martin Pala wrote:

One more thing - I think it shouldn't be part of the 'autostart' statement (that one is intended for another task). Maybe a new statement such as 'automonitor [yes|no]' (it sounds strange - maybe someone has a better suggestion :) would be clearer, if such functionality is added to monit.
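
For illustration only, such a statement could look roughly like this in a service entry ('automonitor' is just the proposal above, and the surrounding monitrc syntax is approximate):

  check apache with pidfile /var/run/apache.pid
    group node1
    automonitor no                      # proposed: don't monitor until "monit start apache"
    start "/etc/init.d/apache start"
    stop  "/etc/init.d/apache stop"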

Martin

Martin Pala wrote:

Yeah, not a bad idea :)

There are two ways to achieve a similar feature:

1.) check the process only when it was started under monit's control, as Oliver described - it is a very simple and effective method; every cluster node needs only one 'local' monit instance.

2.) have the monit instance fail over with the service as part of the resource group - in that case it must be installed on shared disks, and when a cluster reconfiguration is initiated it starts the monit process along with the resource => there should be one monit instance per resource (or more accurately, per shared disk group on which the SCSI reservation is applied). This method doesn't require big monit modifications - the only one needed is an option to specify the location of monit's pid file somewhere in the filesystem (it should be on the shared disk group). Resource failover is transparent for monit - it needn't care about the shared environment; it just starts/stops itself and monitors/starts services => cluster health and shared storage must be monitored/maintained by another service (for example the mentioned heartbeat).
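
As a minimal sketch, variant 2. could be driven from a resource-group start/stop script like the one below; the -p option for placing monit's pid file on the shared disk is only the proposed patch, and all paths are illustrative:

  #!/bin/sh
  # start/stop wrapper for the monit instance that travels with this resource group
  SHARED=/shared/group1        # mounted only on the node that holds the SCSI reservation

  case "$1" in
    start)
      # control file and (proposed) pid file both live on the shared disk group
      monit -c $SHARED/monitrc -p $SHARED/monit.pid
      ;;
    stop)
      # stop the monit daemon that belongs to this resource group
      kill `cat $SHARED/monit.pid`
      ;;
  esac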

The first (Oliver's) method is similar to object registration (as in SUN cluster's pmfadm, for example) - with this extension it would allow building simple clusters. There's yet another question - storage maintenance; I can think of two ways:

a.) the described rc scripts (monit-node1, monit-node2, etc.) are responsible for storage maintenance (storage reserve/release and optionally forcing it) - see the sketch after this list. They shouldn't run 'monit -g service start' before the node masters the storage => otherwise it may lead to a hard error before the monit subsystem is even touched (similar to variant 2. above).

b.) the start/stop scripts invoked by monit are more sophisticated and check/maintain the storage status (possibly doing the SCSI reservation in case the node doesn't master it) before trying to start the service. Since monit currently doesn't watch the return value of these scripts, a failure here will lead to a service timeout on monit's level.
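
A sketch of variant a.) as /etc/init.d/monit-node1 - the reserve/release commands are placeholders for whatever the cluster actually uses to claim the shared disk group:

  #!/bin/sh
  # monit-node1: claim the node1 storage first, only then hand the group to monit

  case "$1" in
    start)
      # reserve and mount the shared disk group before monit touches anything
      scsi-reserve node1-diskgroup || exit 1       # placeholder command
      mount /shared/node1          || exit 1
      monit -g node1 start
      ;;
    stop)
      monit -g node1 stop
      umount /shared/node1
      scsi-release node1-diskgroup                 # placeholder command
      ;;
  esac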


It is possible to allow one or both methods (variant 1. needs Oliver's patch, variant 2. needs an optional pid file location patch).

+1 for Oliver's way

Maybe it would be useful for others to have a 'howto' for building simple clusters with monit :)

Greetings,
Martin


----- Original Message -----
From: "Jan-Henrik Haukeland" <address@hidden>
To: <address@hidden>
Sent: Wednesday, October 23, 2002 7:47 AM
Subject: Re: additional feature for monit-3.0 (for clusters)



I spoke with Oliver off list and asked him to send a mail to the list
for discussion, so does anyone have an opinion on this?

Oliver Jehle <address@hidden> writes:



when using heartbeat and groups in monit, I've missed the following feature:

monit should only monitor manually started resources, and after stopping a resource, monit should stop monitoring it.

So I've implemented a third input value for autostart, "started". Now monit monitors a resource only if you start it with "monit start"...
Why? See below... it's my config for heartbeat with monit.

On every node:

/etc/inittab starts monit
an /etc/rc3.d/ script executes "monit start heartbeat"
/etc/init.d/monit-node1 runs "monit -g node1 start"
/etc/init.d/monit-node2 runs "monit -g node2 start"

So heartbeat can easily control the cluster state, and if one node fails, heartbeat starts the monit-xxxx script of the failing node, and monit is instructed to start the services of the failing node and monitor them...
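
In other words, a takeover amounts to something like this on the surviving node (script names as in the layout above; how heartbeat invokes them is an assumption):

  # node2 has failed; heartbeat on node1 takes over its resource group
  /etc/init.d/monit-node2 start      # runs: monit -g node2 start

  # node2 is healthy again and reclaims its resources
  /etc/init.d/monit-node2 stop       # runs: monit -g node2 stop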


--
Oliver Jehle
Monex AG
Föhrenweg 18
FL-9496 Balzers

Tel: +423 388 1988
Fax: +423 388 1980

----
I've not lost my mind. It's backed up on tape somewhere.
----



--
Jan-Henrik Haukeland


_______________________________________________
monit-dev mailing list
address@hidden
http://mail.nongnu.org/mailman/listinfo/monit-dev


