From: Jan Hudec
Subject: Re: [Gnu-arch-users] Re: [BUG] FEATURE PLANS: revlib locking
Date: Fri, 4 Jun 2004 12:47:23 +0200
User-agent: Mutt/1.5.6i

On Fri, Jun 04, 2004 at 11:01:01 +0100, Andrew Suffield wrote:
> On Fri, Jun 04, 2004 at 11:31:48AM +0200, Jan Hudec wrote:
> > On Thu, Jun 03, 2004 at 23:31:38 +0100, Andrew Suffield wrote:
> > > FSVO "reliable". It's non-atomic - it is not the case that precisely
> > > one process will get the lock. Specifically, it's possible that no
> > > processes will get the lock. If you get two clients trying to claim
> > > the lock at once, they can get into a mutual busy loop, repeatedly
> > 
> > Would you care to explain how neither process can get the lock
> > (provided they all have appropriate permissions)? AFAICT the link
> > operation is atomic on the server.
> 
> I don't believe that's generally true, although the linux NFS server

The only atomicity I want is the atomicity of the underlying syscall!

Now the only way to break that atomicity would be for the kernel to
temporarily create the entry even when it is going to return failure,
and then remove it again. I don't think any kernel does such useless
work. I also believe the atomicity of link(2) is part of POSIX.

Now the only way for the NFS server to break this would be to do the
same useless work -- create the link, then check whether it should have
done so, and delete it again if it shouldn't. In addition to being a
bug, that is so much extra work (when you can just do the syscall and
see) that I don't think any NFS server breaks this.

However, I don't know every single brain-damaged implementation of NFS,
so it's possible that some broken one exists.
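
For concreteness, here is a minimal sketch in C of the link(2)-based
locking under discussion (the directory layout and file names are made
up for the example); the atomic step is the link() call itself:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

/* Try to take the lock in `dir'.  Returns 0 on success, -1 on
 * failure.  The names "lock" and "lock.<pid>" are hypothetical. */
static int take_lock(const char *dir)
{
    char tmp[4096], lock[4096];
    int fd, rc;

    /* Create a file under a name unique to this client... */
    snprintf(tmp, sizeof tmp, "%s/lock.%ld", dir, (long)getpid());
    fd = open(tmp, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* ...and link(2) it to the well-known lock name.  At most one
     * of several concurrent link() calls can succeed, because the
     * syscall is atomic on the server. */
    snprintf(lock, sizeof lock, "%s/lock", dir);
    rc = link(tmp, lock);
    unlink(tmp);              /* the unique name is no longer needed */
    return rc;                /* 0 iff we now hold the lock */
}

(Over NFS the classic refinement is to ignore link()'s return value
and instead stat() the unique file, before unlinking it, to check that
st_nlink == 2, because a retransmitted reply can turn a success into a
spurious error.)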

> probably provides it. In general, "NFS never makes _any_ atomicity or
> ordering guarantees". You also have to deal with the fact that the
> majority of NFS implementations currently deployed are just plain
> broken.
> [...]
> > > trying and failing, so you have to rely on random wait states to break
> > > this loop. This can be quite impressively slow.
> > > 
> > > It also requires manual intervention to break stale locks safely.
> > > 
> > > Neither flock() nor fcntl(), when they work, have these issues.
> > 
> > Any locking has problems with stale locks.
> 
> Well, flock() obviously doesn't.

Because it does not work on network filesystems at all.

> fcntl() also works better than you
> suspect because...

It does work correctly in the local case. It still has the stale-lock
problem in the network case, because locking over a network cannot be
made fully fault-tolerant for fundamental reasons.
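
To make the contrast concrete, here is a minimal sketch of an fcntl()
lock (a whole-file write lock; over NFS it is only as good as the
lockd/statd machinery behind it):

#include <fcntl.h>
#include <unistd.h>

/* Take an exclusive whole-file lock on an already-open fd.
 * Returns 0 on success, -1 if someone else holds the lock. */
static int fcntl_lock(int fd)
{
    struct flock fl = {
        .l_type   = F_WRLCK,    /* exclusive (write) lock */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,          /* 0 == to end of file    */
    };

    /* F_SETLK fails immediately (EACCES or EAGAIN) if the lock is
     * taken; the kernel drops the lock when the process dies, so a
     * local crash cannot leave a stale lock behind. */
    return fcntl(fd, F_SETLK, &fl);
}

Locally the kernel guarantees that cleanup; over NFS that cleanup is
exactly what rpc.statd is supposed to approximate.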

> > And no resolution is perfect.
> > It simply can't be, because you never know for sure whether the client
> > crashed, or just it's connection died.
> 
> ...actually, you do. That's what NSM gives you, as supported by
> rpc.statd, which is a required component for fcntl() locks over
> NFS. It's based on the assumption that clients will get rebooted
> reasonably quickly.

... which may work in a few more cases, but does not work generally.

Over a network you should always use transactions. Unlike locking, they
are sound.
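
In the filesystem-only setting of this thread, the usual way to get a
transaction is an atomic rename(2): prepare the new state under a
temporary name, then commit it in one atomic step. A minimal sketch
(names are made up for the example):

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

/* Atomically replace `path' with `len' bytes of `data'.  A crash at
 * any point leaves either the old file or the new one, never a
 * half-written mix. */
static int commit_file(const char *path, const char *data, size_t len)
{
    char tmp[4096];
    int fd;

    snprintf(tmp, sizeof tmp, "%s.tmp.%ld", path, (long)getpid());
    fd = open(tmp, O_CREAT | O_EXCL | O_WRONLY, 0644);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    close(fd);

    /* The commit point: rename(2) is atomic on POSIX filesystems. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}

There is no stale state to clean up after a crash, which is why this
is sound where a lock is not.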

-------------------------------------------------------------------------------
                                                 Jan 'Bulb' Hudec 
<address@hidden>
