l4-hurd

Re: Reliability of RPC services


From: Jonathan S. Shapiro
Subject: Re: Reliability of RPC services
Date: Tue, 25 Apr 2006 12:30:59 -0400

On Tue, 2006-04-25 at 17:47 +0200, Michal Suchanek wrote:
> On 4/25/06, Jonathan S. Shapiro <address@hidden> wrote:
> > On Tue, 2006-04-25 at 11:54 +0200, Michal Suchanek wrote:
> >
> > > ad (b) Imagine a few scenarios:
> > > ...
> > > And I do not think that timeouts or watchdogs solve [these] on a
> > > non-realtime system.
> >
> > I agree. However, this mis-states the issue. You are talking about what
> > happens when you have already decided to recover (e.g. by killing a
> > non-performing renderer). The purpose of the timeout is to help
> > determine when recovery is required.
> >
> > Also, in each of the examples that you gave, an asynchronous interface
> > is appropriate. Recovering on an asynchronous interface is relatively
> > straightforward.
> >
> 
> So you say that the timeouts and watchdogs actually solve a different
> kind of problem.

No. Watchdogs and timeouts are the same thing. You were talking about
cases where a user says "kill that rendering agent, because it is
misbehaving."

> 
> The send-once + reference-counted capabilities serve to notify when a
> service has already failed. This allows the client to restart the
> action or use different means for obtaining the service. Or just free
> any resources associated with the failed service in case of a proxy.
> 
> But the watchdog is used to identify a service that is slow to respond
> and may be the one that is failing so that the user may remove it and
> trigger the recovery.

Notice that the first is subsumed by the second. The only question is to
decide what latency is acceptable before noticing that a server has been
destroyed. This will determine whether a timeout is sufficient.
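
To make the "subsumed" point concrete, here is a minimal sketch in
Python. It is only a model of the argument, not any L4 or Hurd
interface: the names call_with_timeout, SERVER_DESTROYED and reply_box
are hypothetical, and a plain in-process queue stands in for the reply
channel. A destruction notice and an expired timeout both end in the
same recovery decision; the notice only shortens the time it takes to
get there.

```python
import queue

# Hypothetical marker standing in for a "server destroyed" notification
# (e.g. what a send-once or reference-counted capability might deliver).
SERVER_DESTROYED = object()


def call_with_timeout(reply_box, timeout_s):
    """Wait for a reply and report which event decided the outcome.

    reply_box is a queue.Queue used as a stand-in for the reply channel.
    Returns ("ok", reply) on success, or ("recover", reason) when the
    caller should start recovery.
    """
    try:
        msg = reply_box.get(timeout=timeout_s)
    except queue.Empty:
        # No reply and no notice within the bound: the watchdog fires.
        return ("recover", "timeout")
    if msg is SERVER_DESTROYED:
        # Explicit notice: same decision, just reached sooner.
        return ("recover", "destroyed-notice")
    return ("ok", msg)
```

If waiting the full timeout_s before noticing a dead server is
acceptable, the "destroyed-notice" branch can be dropped and the timeout
alone covers both cases.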

shap




