Re: [Gluster-devel] gluster doesn't like Oracle's FSINFO RPC call


From: Michael Brown
Subject: Re: [Gluster-devel] gluster doesn't like Oracle's FSINFO RPC call
Date: Fri, 12 Apr 2013 15:58:04 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130329 Thunderbird/17.0.5

KERBOOM

address@hidden ~]$ sudo mount -a -t nfs
[sudo] password for michael:
mount: fearless1:/gv0 failed, reason given by server: No such file or directory
mount: fearless1:/gv0/fleming1/db0/ALTUS_config failed, reason given by server: unknown nfs status return value: 22
mount: fearless1:/gv0/fleming1/db0/ALTUS_data failed, reason given by server: unknown nfs status return value: 22
mount: fearless1:/gv0/fleming1/db0/ALTUS_flash failed, reason given by server: unknown nfs status return value: 22
mount.nfs: mount point /db/flash_recovery_area/ALTUS/onlinelog does not exist

nfs.log:
[2013-04-12 15:55:16.507084] E [nfs3.c:305:__nfs3_get_volume_id] (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo+0x22c) [0x7f45bfbb852c] (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo_reply+0x29) [0x7f45bfbb2ce9] (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x51) [0x7f45bfbb2481]))) 0-nfs-nfsv3: invalid argument: xl
[2013-04-12 15:55:16.538560] E [nfs3.c:4706:nfs3_fsinfo] 0-nfs-nfsv3: Bad Handle
[2013-04-12 15:55:16.538580] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: 242c1550, FSINFO: NFS: 10001(Illegal NFS file handle), POSIX: 14(Bad address)
[2013-04-12 15:55:16.538617] E [nfs3.c:305:__nfs3_get_volume_id] (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo+0x22c) [0x7f45bfbb852c] (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_fsinfo_reply+0x29) [0x7f45bfbb2ce9] (-->/usr/lib64/glusterfs/3.3.1/xlator/nfs/server.so(nfs3_request_xlator_deviceid+0x51) [0x7f45bfbb2481]))) 0-nfs-nfsv3: invalid argument: xl

(I tried both with and without changing your 'uint32_t size' to an 'int32_t size' to correct the signedness of the argument.)

Get hold of me on IRC and let's get this figured out. I've got a debugger attached.

M.

On 13-04-12 11:32 AM, Niels de Vos wrote:
On Fri, Apr 12, 2013 at 05:23:08PM +0200, Niels de Vos wrote:
On Thu, Apr 11, 2013 at 12:37:30PM -0400, Michael Brown wrote:
That actually broke everything (including Linux trying to mount NFS).

I've modified it slightly to be:

bool_t
xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
{
        if (!xdr_bytes (xdrs, (char **)&objp->data.data_val,
                        (u_int *) &objp->data.data_len, NFS3_FHSIZE))
                if (!xdr_opaque (xdrs, &objp, (u_int *) &objp->data.data_len))
                        return FALSE;
        return TRUE;
}

(i.e. only call the xdr_opaque function if the xdr_bytes decode fails)

Nah, that won't work. The xdr_* functions modify the position of the
cursor in the XDR stream, so subsequent reads continue where the
previous one finished.

What you probably need to do is something like this:

bool_t
xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
{
	uint32_t size;

	if (!xdr_int (xdrs, &size))
		if (!xdr_opaque (xdrs, (u_int *)&objp->data.data_len, size))
			/* ^ that should be objp->data.data_val of course :-/ */
			return FALSE;
	return TRUE;
}

That will read the size of the fhandle first, to determine how long the opaque 
fhandle is, and use that size to read it.
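
A minimal, library-free sketch of that length-first decode may make the intent clearer. It assumes the usual XDR wire layout for variable-length opaque data: a 4-byte big-endian length, the bytes themselves, then zero padding up to a 4-byte boundary. The struct xdr_cursor and the cur_get_* helpers are hypothetical stand-ins for the XDR stream, not part of glusterfs or the Sun RPC library, and FH3_MAX mirrors the 64-byte NFS3_FHSIZE limit. Unlike the nested ifs in the snippet as posted, which only reach xdr_opaque when xdr_int fails, the two reads here run in sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FH3_MAX 64  /* same value as NFS3_FHSIZE */

/* Hypothetical stand-in for an XDR decode stream: a buffer plus a cursor. */
struct xdr_cursor {
    const uint8_t *buf;
    size_t len;
    size_t pos;     /* advances every time an item is consumed */
};

/* Read one big-endian 32-bit unsigned integer and advance the cursor. */
int cur_get_u32(struct xdr_cursor *c, uint32_t *out)
{
    assert(c != NULL && out != NULL && c->pos <= c->len);
    if (c->len - c->pos < 4)
        return 0;
    const uint8_t *p = c->buf + c->pos;
    *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
    c->pos += 4;
    return 1;
}

/*
 * Decode a variable-length file handle: read the length word first,
 * then use that length to copy the opaque bytes, then skip the padding.
 */
int cur_get_fh3(struct xdr_cursor *c, uint8_t *data, uint32_t *data_len)
{
    uint32_t size;

    assert(data != NULL && data_len != NULL);
    if (!cur_get_u32(c, &size))
        return 0;
    if (size > FH3_MAX)
        return 0;   /* note: the length word has already been consumed */

    size_t padded = ((size_t)size + 3) & ~(size_t)3;
    if (c->len - c->pos < padded)
        return 0;

    memcpy(data, c->buf + c->pos, size);
    *data_len = size;
    c->pos += padded;
    return 1;
}
```

Note that when the size check fails, the cursor has still moved past the length word. That is the same cursor behaviour that makes "try xdr_bytes, then fall back to xdr_opaque" unworkable on a real XDR stream unless the position is saved and restored between attempts.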

Cheers,
Niels

But I get no change in behaviour.

Also get these warnings:

xdr-nfs3.c: In function 'xdr_nfs_fh3':
xdr-nfs3.c:197: warning: passing argument 2 of 'xdr_opaque' from incompatible pointer type
/usr/include/rpc/xdr.h:313: note: expected 'caddr_t' but argument is of type 'struct nfs_fh3 **'
xdr-nfs3.c:197: warning: passing argument 3 of 'xdr_opaque' makes integer from pointer without a cast
/usr/include/rpc/xdr.h:313: note: expected 'u_int' but argument is of type 'u_int *'

M.

On 13-04-11 07:42 AM, Niels de Vos wrote:
My guess is that this (untested) change would fix it, can you try that?

--- a/rpc/xdr/src/xdr-nfs3.c
+++ b/rpc/xdr/src/xdr-nfs3.c
@@ -184,7 +184,7 @@ xdr_specdata3 (XDR *xdrs, specdata3 *objp)
 bool_t
 xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
 {
-	 if (!xdr_bytes (xdrs, (char **)&objp->data.data_val, (u_int *) &objp->data.data_len, NFS3_FHSIZE))
+	 if (!xdr_opaque (xdrs, &objp, (u_int *) &objp->data.data_len))
 		 return FALSE;
 	return TRUE;
 }


HTH,
Niels

All I get out of gluster is:
[2013-04-08 12:54:32.206312] E [nfs3.c:4741:nfs3svc_fsinfo] 0-nfs-nfsv3: Error decoding arguments


I've attached abridged packet captures and text explanations of the
packets (thanks to wireshark).

Can someone please look at this and determine whether gluster's parsing
of the RPC call is to blame, or whether it's Oracle?

This is the same setup on which I reported the NFS race condition bug.
It does have that patch applied.
Details:
http://lists.gnu.org/archive/html/gluster-devel/2013-04/msg00014.html

Thanks,

Michael

-- 
Michael Brown               | `One of the main causes of the fall of
Systems Consultant          | the Roman Empire was that, lacking zero,
Net Direct Inc.             | they had no way to indicate successful
☎: +1 519 883 1172 x5106    | termination of their C programs.' - Firth




_______________________________________________
Gluster-devel mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/gluster-devel

          

-- 
Niels de Vos
Sr. Software Maintenance Engineer
Support Engineering Group
Red Hat Global Support Services
