Re: [Gluster-devel] bug with TLA 313?


From: Anand Avati
Subject: Re: [Gluster-devel] bug with TLA 313?
Date: Sat, 21 Jul 2007 02:22:29 +0530

Brent,
there was a bug in setxattr where the length was being calculated off by -1
for binary (non-ASCII) values. Can you please check whether your cp goes
through now? I'm very sorry that we are unable to test this ourselves, since we
don't have a system which uses POSIX ACLs, though xattrs are now working fine
on binary data (before the fix they worked only for pure ASCII data).

thanks,
avati

2007/7/20, Brent A Nelson <address@hidden>:

Nope, it's still there.  Example strace snippet:

setxattr("/beast/glusterfs/beast", "system.posix_acl_access",
"\x02\x00\x00\x00\x01\x00\x06\x00\xff\xff\xff\xff\x04\x00\x04\x00\xff\xff\xff\xff \x00\x04\x00\xff\xff\xff\xff", 28, 0) = -1 EINVAL (Invalid argument)

It presumably should have returned EOPNOTSUPP (Operation not supported),
instead.
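
For reference, the 28-byte value in the strace above can be decoded by hand. The sketch below assumes the standard Linux layout for system.posix_acl_access (a 4-byte little-endian version followed by 8-byte entries of tag, permission bits, and id) and assumes the third entry's tag byte is 0x20, which strace prints as a space. The function and table names are illustrative only and are not part of GlusterFS.

```python
import struct

# Tag values as assumed from the Linux POSIX ACL xattr layout.
ACL_TAGS = {
    0x01: "USER_OBJ",
    0x02: "USER",
    0x04: "GROUP_OBJ",
    0x08: "GROUP",
    0x10: "MASK",
    0x20: "OTHER",
}

# The value from the strace, with the third tag written explicitly as 0x20.
VALUE = (
    b"\x02\x00\x00\x00"                      # version
    b"\x01\x00\x06\x00\xff\xff\xff\xff"      # entry 1
    b"\x04\x00\x04\x00\xff\xff\xff\xff"      # entry 2
    b"\x20\x00\x04\x00\xff\xff\xff\xff"      # entry 3
)


def parse_posix_acl_xattr(value):
    """Split a raw ACL xattr value into (version, [(tag, perm, id), ...])."""
    if len(value) < 4 or (len(value) - 4) % 8 != 0:
        raise ValueError("unexpected ACL xattr length: %d" % len(value))
    (version,) = struct.unpack_from("<I", value, 0)
    entries = []
    for offset in range(4, len(value), 8):
        tag, perm, ident = struct.unpack_from("<HHI", value, offset)
        entries.append((ACL_TAGS.get(tag, hex(tag)), perm, ident))
    return version, entries


def perm_string(perm):
    """Render the low three permission bits as an rwx string."""
    return "".join(c if perm & bit else "-" for c, bit in (("r", 4), ("w", 2), ("x", 1)))


version, entries = parse_posix_acl_xattr(VALUE)
for tag, perm, ident in entries:
    print(tag, perm_string(perm), hex(ident))
```

Under those assumptions the value is a minimal three-entry ACL (owner rw-, group r--, other r--), which is consistent with cp -a trying to preserve plain permissions through the ACL interface. The value contains NUL bytes, so its length has to be carried explicitly rather than derived from the data.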

Thanks,

Brent

On Fri, 20 Jul 2007, Anand Avati wrote:

> Brent,
> there was a fix in fuse_setxattr in patch-325. please check if it fixes
> your issue. AFR was only reporting the errnos passed up through it.
>
> thanks,
> avati
>
> 2007/7/20, Brent A Nelson <address@hidden>:
>>
>> I should point out that this was with the full (AFR/unify) setup, not the
>> stripped-down setup.  I also get a lot of messages such as the following
>> in /var/log/glusterfs/glusterfs.log:
>> 2007-07-19 15:19:28 E [afr.c:514:afr_setxattr_cbk] mirror4: (path=/usr0
>> child=share4-0) op_ret=-1 op_errno=22
>> 2007-07-19 15:19:28 E [afr.c:514:afr_setxattr_cbk] mirror0: (path=/usr0
>> child=share0-0) op_ret=-1 op_errno=22
>> 2007-07-19 15:57:17 E [afr.c:575:afr_getxattr_cbk] mirror7:
>> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:17 E [afr.c:575:afr_getxattr_cbk] mirror7:
>> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
>> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
>> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror6:
>> (path=/nfs/share/locale/cs child=share6-0) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror6:
>> (path=/nfs/share/locale/cs child=share6-0) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
>> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>> 2007-07-19 15:57:24 E [afr.c:575:afr_getxattr_cbk] mirror7:
>> (path=/nfs/share/locale/cs/LC_TIME child=share7-1) op_ret=-1 op_errno=61
>>
>> Thanks,
>>
>> Brent
>>
>> On Thu, 19 Jul 2007, Brent A Nelson wrote:
>>
>> > Patch 322 seems to have fixed the stray ls errors, but not the cp -a
>> > complaints.  A "cp -a" strace is attached.
>> >
>> > Thanks,
>> >
>> > Brent
>> >
>> > On Wed, 18 Jul 2007, Brent A Nelson wrote:
>> >
>> >> Aha, it looks like GlusterFS is giving odd/varying error responses to
>> >> queries for ACL information (I assume it should be giving an "operation
>> >> not supported" error).  This must be related to my previously reported
>> >> problem copying from GlusterFS to GlusterFS where it was complaining about
>> >> preserving ACLs for every file copied.
>> >>
>> >> See attached strace.
>> >>
>> >> Thanks,
>> >>
>> >> Brent
>> >>
>> >> PS At least in this simple case where glusterfs is directly mounting a
>> >> storage/posix, NFS reexport works fine. I haven't had a chance to test a
>> >> full setup with recent GlusterFS tlas, but I will once the ACL glitch is
>> >> squashed.
>> >>
>> >> On Wed, 18 Jul 2007, Anand Avati wrote:
>> >>
>> >>> Brent,
>> >>> very interesting diagnosis! is it possible for you to re-create the 'posix
>> >>> only' setup (no server/client) and again do 'strace ls -ial /beast'? we are
>> >>> not able to reproduce this error at our setup.
>> >>>
>> >>> thanks
>> >>> avati
>> >>>
>> >>> 2007/7/17, Brent A Nelson <address@hidden>:
>> >>>>
>> >>>> Just a quick note that this doesn't seem to be any sort of corruption
>> >>>> issue.  I completely emptied all my shares (even removing lost+found) and
>> >>>> my namespace and rsynced the corresponding AFR shares and namespace.  The
>> >>>> only thing different between the AFRs would be ctimes.
>> >>>>
>> >>>> I restarted everything, and did:
>> >>>> ls -al /beast
>> >>>> ls: /beast: File exists
>> >>>> ls: /beast/.: File exists
>> >>>> total 8
>> >>>> drwxr-xr-x  2 root root 4096 2007-07-17 09:27 .
>> >>>> drwxr-xr-x 27 root root 4096 2007-07-02 10:18 ..
>> >>>>
>> >>>> I also tried disabling readahead and writebehind (my only performance
>> >>>> translators).  It didn't help.  Changing the unify from alu to rr also
>> >>>> didn't help.
>> >>>>
>> >>>> I then tried "glusterfs -f /etc/glusterfs/beast -n mirror0 /beast" to
>> >>>> mount a single AFR, no unify.  It STILL produces the same messages.
>> >>>>
>> >>>> I then tried "glusterfs -f /etc/glusterfs/beast -n share0-0 /beast" to
>> >>>> mount a simple, single share used as half of an AFR.  Same issue.
>> >>>>
>> >>>> I then stripped down a server to serve out one single storage/posix share,
>> >>>> with no posix locks (I wasn't using any other translators on the server
>> >>>> side, apart from protocol/server, of course).  I mounted that share as in
>> >>>> the previous attempt.  No difference!
>> >>>>
>> >>>> So, this issue occurs even with just protocol/client, protocol/server, and
>> >>>> storage/posix in use.  As barebones as you can get.  Almost.
>> >>>>
>> >>>> One more try.  No glusterfsd, and glusterfs accesses a single
>> >>>> storage/posix directly:
>> >>>>
>> >>>> ls -al /beast
>> >>>> ls: /beast: File exists
>> >>>> ls: /beast/.: File exists
>> >>>> total 8
>> >>>> drwxr-xr-x  2 root root 4096 2007-07-17 09:27 .
>> >>>> drwxr-xr-x 27 root root 4096 2007-07-02 10:18 ..
>> >>>>
>> >>>> No difference, even with just glusterfs directly accessing a single, local
>> >>>> storage/posix, with no other translators.  Spec is simply:
>> >>>>
>> >>>> volume share0
>> >>>>    type storage/posix                   # POSIX FS translator
>> >>>>    option directory /share0             # Export this directory
>> >>>> end-volume
>> >>>>
>> >>>> Ubuntu Feisty, Fuse 2.6.3.
>> >>>>
>> >>>> Any ideas?
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>> Brent
>> >>>>
>> >>>>
>> >>>> On Sat, 14 Jul 2007, Brent A Nelson wrote:
>> >>>>
>> >>>> > It's the same spec I was using previously (AFRed namespace cache, unified
>> >>>> > AFRs spread across four servers, posix-locks, readahead, and writebehind).
>> >>>> > It's not just the top-level directory; it's everywhere.
>> >>>> >
>> >>>> > Thanks,
>> >>>> >
>> >>>> > Brent
>> >>>> >
>> >>>> > On Sat, 14 Jul 2007, Anand Avati wrote:
>> >>>> >
>> >>>> >> Brent,
>> >>>> >> this is strange, patch-313 has been working pretty smoothly for us so far.
>> >>>> >> are there any changes in your spec? is this behaviour seen only in this
>> >>>> >> particular directory or 'anywhere' in general? please attach your spec so
>> >>>> >> that we can try to reproduce it in our labs.
>> >>>> >>
>> >>>> >> thanks,
>> >>>> >> avati
>> >>>> >>
>> >>>> >> 2007/7/14, Brent A Nelson <address@hidden>:
>> >>>> >>>
>> >>>> >>> Updating to the latest TLA patch, I got odd issues just with "ls":
>> >>>> >>>
>> >>>> >>> Example:
>> >>>> >>>
>> >>>> >>> ls -al /beast/
>> >>>> >>> ls: /beast/: No such file or directory
>> >>>> >>> ls: /beast/.: No such file or directory
>> >>>> >>> ls: /beast/lost+found: No such file or directory
>> >>>> >>> ls: /beast/usr0: No such file or directory
>> >>>> >>> ls: /beast/usr: No such file or directory
>> >>>> >>> total 32
>> >>>> >>> drwxr-xr-x  5 root root  4096 2007-07-13 16:18 .
>> >>>> >>> drwxr-xr-x 27 root root  4096 2007-06-25 18:34 ..
>> >>>> >>> drwx------  2 root root 16384 2007-06-25 17:08 lost+found
>> >>>> >>> drwxr-xr-x 10 root root  4096 2007-06-18 13:31 usr
>> >>>> >>> drwxr-xr-x 10 root root  4096 2007-06-18 13:31 usr0
>> >>>> >>>
>> >>>> >>> I have one machine that is no longer returning from an "ls".  I get other
>> >>>> >>> messages sometimes, not just "No such file or directory", but also "Bad
>> >>>> >>> file descriptor" or even "File exists".  These extraneous messages are
>> >>>> >>> also occurring when copying from the GlusterFS to the GlusterFS.  The
>> >>>> >>> files and directories mentioned do, in fact, exist, no matter what the
>> >>>> >>> extraneous error message says.
>> >>>> >>>
>> >>>> >>> Is there a known issue with the current patchset?
>> >>>> >>>
>> >>>> >>> Thanks,
>> >>>> >>>
>> >>>> >>> Brent
>> >>>> >>>
>> >>>> >>>
>> >>>> >>> _______________________________________________
>> >>>> >>> Gluster-devel mailing list
>> >>>> >>> address@hidden
>> >>>> >>> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>> >>>> >>>
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> --
>> >>>> >> Anand V. Avati
>> >>>> >>
>> >>>> >
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Anand V. Avati
>> >
>>
>
>
>
> --
> Anand V. Avati
>




--
Anand V. Avati

