l4-hurd

Re: Reliability of RPC services


From: Marcus Brinkmann
Subject: Re: Reliability of RPC services
Date: Tue, 25 Apr 2006 17:07:43 +0200
User-agent: Wanderlust/2.14.0 (Africa) SEMI/1.14.6 (Maruoka) FLIM/1.14.7 (Sanjō) APEL/10.6 Emacs/21.4 (i486-pc-linux-gnu) MULE/5.0 (SAKAKI)

At Tue, 25 Apr 2006 10:07:03 -0400,
"Jonathan S. Shapiro" <address@hidden> wrote:
> A caution about "send-exactly-once": there is no such thing. One of the
> things that we should try to preserve is the possibility of extending
> capabilities across a network. It is well known that (1)
> "send-exactly-once" cannot be implemented across a network, and (2) if a
> watchdog terminates a connection, there is a fundamental race: the
> server will not know that the session is gone until it tries to reply,
> which may be after it completes the operation. All of this is true
> because of network partitions.

Yeah, that's odd.  However, it is still send-exactly-once: it's just
that the message gets transformed along the way.  The remote server
sends a proper reply, but what is actually delivered is the
local timeout message.  That's a bit like a chocolate Santa Claus in
an Easter-bunny wrapping.
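
To make the wrapping concrete, here is a minimal sketch in Python. It is
not code from this thread or from any Hurd or L4 interface: `ReplySlot`,
`deliver`, `receive` and `demo` are hypothetical names, and the model
assumes that a local watchdog and the remote server compete to fill a
single one-shot reply slot. The client still receives exactly one
message, but it may be the local timeout rather than the server's reply,
even though the server completed the operation.

```python
import threading


class ReplySlot:
    """Hypothetical one-shot reply slot: it accepts exactly one message."""

    def __init__(self):
        self._lock = threading.Lock()
        self._filled = threading.Event()
        self._message = None

    def deliver(self, message):
        """Try to fill the slot; only the first caller succeeds."""
        with self._lock:
            if self._filled.is_set():
                return False  # slot already consumed, message is dropped
            self._message = message
            self._filled.set()
            return True

    def receive(self):
        """Block until the single message has been delivered."""
        self._filled.wait()
        return self._message


def demo():
    slot = ReplySlot()
    side_effects = []

    # Assumption: the watchdog fires first, e.g. because of a partition.
    watchdog_won = slot.deliver(("timeout", None))

    # The server cannot know that. It completes the operation anyway and
    # only finds out that the session is gone when it tries to reply.
    side_effects.append("operation completed")
    server_won = slot.deliver(("reply", 42))

    return slot.receive(), watchdog_won, server_won, side_effects
```

In this model the server's reply is silently dropped, which is the race
described in the quoted text: the operation has taken effect, yet the
client only ever sees the timeout.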
 
> *Because* we want to preserve this possibility, I think that this is
> also the correct baseline architecture for local failures. If we
> introduce a cancellation mechanism, we must understand that cancellation
> is best-effort, and not guaranteed.

Nasty stuff.  And it gets better and better: if you have a chain of N
invocations and K points of failure, you are really screwed, because
then there are at least K points you have to "nudge" in some way to
get everything propagated.  And there is a good chance that this
happens in practice, because of systematic correlations between
failure modes.  Ouch!
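
A toy model of the chain, again in Python and purely illustrative: the
function `reached` and its parameters are hypothetical, and it assumes
that a cancellation travels hop by hop along the chain and stops at any
failed link. Under that assumption a single nudge at the head of the
chain is not enough; each failure point needs a nudge of its own on the
far side.

```python
def reached(n_hops, failed_links, nudges):
    """Return the set of hops that learn about the cancellation.

    Hops are numbered 0 .. n_hops - 1. Hop i forwards to hop i + 1 over
    link i, unless link i is in failed_links. Each entry in nudges is a
    hop at which someone injects the cancellation by hand.
    """
    seen = set()
    for start in nudges:
        hop = start
        while hop < n_hops and hop not in seen:
            seen.add(hop)
            if hop in failed_links:
                break  # propagation stops at the failed link
            hop += 1
    return seen


# N = 5 invocations, K = 2 failed links (after hop 1 and after hop 3).
only_head = reached(5, {1, 3}, [0])
every_failure = reached(5, {1, 3}, [0, 2, 4])
```

With only the head nudged, hops 2 to 4 never hear about the
cancellation; nudging past each of the two failed links as well reaches
all five hops.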

Thanks,
Marcus




