gnu-arch-users

Re: [Gnu-arch-users] Re: [BUG] FEATURE PLANS: revlib locking


From: Tom Lord
Subject: Re: [Gnu-arch-users] Re: [BUG] FEATURE PLANS: revlib locking
Date: Sun, 6 Jun 2004 12:30:16 -0700 (PDT)

    > From: address@hidden (Julian T. J. Midgley)

    > >On Sat, May 29, 2004 at 02:14:39PM +0100, Andrew Suffield wrote:
    >>> But if you want to support almost any other combinations of
    >>> anything then you're screwed.

    >>Oh, and hint: it is not without good reason that expansions such as
    >>"Network Failure Service", "No File is Safe", became popular.

    > I understood that 

    >  1. Create nonce file 
    >  2. hardlink lock file to it
    >  3. check reference count of nonce file 

    > Was a reliable way of locking files across NFS.  Is this not the case?
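
For reference, the three quoted steps can be sketched roughly as below.
This is only an illustration in Python, not code from arch; the function
names and the ".nonce-" naming are assumptions made for the example.
The link count is checked instead of trusting the result of link(),
since over NFS a reply can be lost even though the operation happened.

```python
import os
import uuid

def try_lock_by_link(lock_path):
    """Try the nonce/hard-link scheme. Return the nonce path on success, else None."""
    directory = os.path.dirname(os.path.abspath(lock_path))
    # 1. Create a nonce file under a name no other process will choose.
    nonce = os.path.join(directory, ".nonce-%s" % uuid.uuid4().hex)
    fd = os.open(nonce, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    os.close(fd)
    # 2. Hard-link the lock name to the nonce. The error status is ignored
    #    on purpose; step 3 decides whether the link is really in place.
    try:
        os.link(nonce, lock_path)
    except OSError:
        pass
    # 3. A link count of 2 means the lock name points at our nonce.
    if os.stat(nonce).st_nlink == 2:
        return nonce
    os.unlink(nonce)
    return None

def unlock_by_link(lock_path, nonce):
    """Release the lock by removing both names."""
    os.unlink(lock_path)
    os.unlink(nonce)
```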


You could do just as well with only one system call:

        1. Rename lock file to nonce file.  If that
           succeeds, you own the lock (at least at the
           instant that the rename completes).
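
A minimal sketch of that single-rename lock, in Python.  It assumes the
convention that the lock is free while the lock file exists under its
well-known name; the function names and the ".held-" suffix are made up
for the example.

```python
import os
import uuid

def try_lock_by_rename(lock_path):
    """Take the lock by renaming the lock file away. Return the nonce path or None."""
    nonce = "%s.held-%s" % (lock_path, uuid.uuid4().hex)
    try:
        # One system call. If it succeeds, we owned the lock at the
        # instant the rename completed.
        os.rename(lock_path, nonce)
    except FileNotFoundError:
        # The lock file is not there: somebody else has renamed it away.
        return None
    return nonce

def unlock_by_rename(lock_path, nonce):
    """Release the lock by renaming the file back to its well-known name."""
    os.rename(nonce, lock_path)
```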

Unfortunately, that only gets you an advisory lock -- nothing enforces
the lock.

Worse, that only gets you a _useless_ advisory lock -- nothing _can_
enforce the lock.

Because locks are represented persistently, and because processes that
make them have independent lifetimes, it is always possible that the
lock will outlive your process.

That's fine.  No problem -- but it does mean you need a way to be able
to _break_ a lock whose process has died.

It's impossible, in the general case, to be certain that a process
you've heard about has actually died.   So, your decision to break a
lock might be a mistaken one.

What is the upshot of all of that?   To actually be robust, your
process not only has to acquire a lock, it also has to be able to
handle the situation where somebody else breaks that lock while your
process is still running.

Yet with link counts or renamed lock files, your process cannot
ever be certain that its lock hasn't been broken.   It can check --
but the possible outcomes are:

        1) it _was_ broken by the time you checked

        2) it wasn't broken by the time you checked but it might be
           by now

(2) isn't enough justification to continue mutating the data
structures the lock is supposed to protect.
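
The gap between checking and acting can be replayed deterministically.
The sketch below is only an illustration in Python: two parties "A" and
"B" are simulated in one process, using a renamed lock file, and the
".held-a" and ".held-b" names are assumptions made for the example.

```python
import os
import tempfile

def demo_stale_check():
    """Replay outcome (2): A's check passes, then B breaks the lock anyway."""
    d = tempfile.mkdtemp()
    lock = os.path.join(d, "lock")
    open(lock, "w").close()

    held_a = lock + ".held-a"
    os.rename(lock, held_a)                # A takes the lock
    a_saw_lock = os.path.exists(held_a)    # A checks: not broken (so far)

    os.rename(held_a, lock)                # B decides A has died, breaks the lock
    held_b = lock + ".held-b"
    os.rename(lock, held_b)                # B takes the lock

    # A's earlier check is now stale; if A mutates the protected data on
    # the strength of it, A and B are both writing.
    a_really_holds = os.path.exists(held_a)
    return a_saw_lock, a_really_holds
```

Here demo_stale_check() returns (True, False): the check succeeded, yet
by the time A would act on it the lock belongs to B.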

This kind of problem is why arch archives have such a strange locking
protocol involving a couple of nested directories and lots of fancy
directory renaming.

The arch protocol is designed in such a way that if somebody else
breaks my lock but my process doesn't notice, the worst that happens
is that my process continues doing work for a little while in a
scratch directory that will eventually be deleted.   In other words,
the twisted directory tricks in archive locking manage to extract
out of NFS semantics a kind of locking that _is_ robust -- that is
enforced -- that will stop your process from changing the archive if
I break its lock.
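
To illustrate the property (not the actual arch archive layout, which
is not spelled out here), the sketch below uses a lock that is a
directory, with the scratch work done inside it.  Breaking the lock
renames the whole directory aside, so the holder's final "commit"
rename can no longer find its source and fails.  All names ("lock-free",
"lock-held-", "contents", "lock-new") are assumptions made for the
example.

```python
import os

def acquire(archive, who):
    """Rename the free lock directory to a held name. Return it, or None if not free."""
    held = os.path.join(archive, "lock-held-" + who)
    try:
        os.rename(os.path.join(archive, "lock-free"), held)
    except OSError:
        return None
    return held

def commit(archive, held, revision):
    """Publish the scratch work. The first rename fails if the lock was broken."""
    os.rename(os.path.join(held, "contents"), os.path.join(archive, revision))
    # Put an empty scratch area back and release the lock.
    os.mkdir(os.path.join(held, "contents"))
    os.rename(held, os.path.join(archive, "lock-free"))

def break_lock(archive, held):
    """Move the holder's directory aside as junk and install a fresh free lock."""
    junk = held + ".broken"
    os.rename(held, junk)
    fresh = os.path.join(archive, "lock-new")
    os.makedirs(os.path.join(fresh, "contents"))
    os.rename(fresh, os.path.join(archive, "lock-free"))
    return junk   # can be deleted at leisure
```

A holder whose lock was broken gets an OSError from commit() and
nothing appears in the archive; whatever it wrote lives only under the
junk directory.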

Unfortunately, archive-protocol locks are not all that general.
They're good for what arch does (making a list of revisions, each of
which never changes after it's in place) but aren't easily adapted to
more varied kinds of file system operations.

-t





