Bug 950121 - gluster doesn't like Oracle's FSINFO RPC call
Summary: gluster doesn't like Oracle's FSINFO RPC call
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: nfs
Version: 3.3.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Niels de Vos
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-04-09 17:00 UTC by Michael Brown
Modified: 2014-04-17 11:41 UTC (History)
3 users

Fixed In Version: glusterfs-3.5.0
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 955753
Environment:
2 × gluster servers: 2×E5-2670, 128GB RAM, RHEL 6.4 64-bit, glusterfs-server-3.3.1-1.el6.x86_64 (from EPEL)
4 × NFS clients: 2×E5-2660, 128GB RAM, RHEL 5.7 64-bit, glusterfs-3.3.1-11.el5 (from kkeithley's repo, only used for testing)
Bricks are 400GB SSDs with ext4 (and dir_index off).
Common network is 10GbE; replication between servers happens over a direct 10GbE link.

gluster> volume info gv0
Volume Name: gv0
Type: Distributed-Replicate
Volume ID: 20117b48-7f88-4f16-9490-a0349afacf71
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: fearless1:/export/bricks/500117310007a6d8/glusterdata
Brick2: fearless2:/export/bricks/500117310007a674/glusterdata
Brick3: fearless1:/export/bricks/500117310007a714/glusterdata
Brick4: fearless2:/export/bricks/500117310007a684/glusterdata
Brick5: fearless1:/export/bricks/500117310007a7dc/glusterdata
Brick6: fearless2:/export/bricks/500117310007a694/glusterdata
Brick7: fearless1:/export/bricks/500117310007a7e4/glusterdata
Brick8: fearless2:/export/bricks/500117310007a720/glusterdata
Brick9: fearless1:/export/bricks/500117310007a7ec/glusterdata
Brick10: fearless2:/export/bricks/500117310007a74c/glusterdata
Brick11: fearless1:/export/bricks/500117310007a838/glusterdata
Brick12: fearless2:/export/bricks/500117310007a814/glusterdata
Brick13: fearless1:/export/bricks/500117310007a850/glusterdata
Brick14: fearless2:/export/bricks/500117310007a84c/glusterdata
Brick15: fearless1:/export/bricks/500117310007a858/glusterdata
Brick16: fearless2:/export/bricks/500117310007a8f8/glusterdata
Options Reconfigured:
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
nfs.disable: off
Last Closed: 2014-04-17 11:41:35 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments
Packet capture of failed NFS FSINFO RPC (1010 bytes, application/vnd.tcpdump.pcap)
2013-04-09 17:00 UTC, Michael Brown
Good FSINFO RPC from Linux (1.60 KB, application/vnd.tcpdump.pcap)
2013-04-09 17:01 UTC, Michael Brown
Text summary of failed FSINFO RPC (1.38 KB, application/x-xz)
2013-04-09 17:02 UTC, Michael Brown
Text summary of successful FSINFO RPC (1.66 KB, application/x-xz)
2013-04-09 17:02 UTC, Michael Brown
Proposed patch for testing (1.53 KB, patch)
2013-04-12 22:53 UTC, Niels de Vos
Updated patch (1.50 KB, patch)
2013-04-13 13:11 UTC, Niels de Vos

Description Michael Brown 2013-04-09 17:00:23 UTC
Created attachment 733285 [details]
Packet capture of failed NFS FSINFO RPC

I'm trying to get Oracle's DNFS working against gluster's internal NFS server, and I've run into a snag. After Oracle mounts the exported NFS filesystem, the FSINFO call fails.

Let's look with wireshark:

»Remote Procedure Call, Type:Call XID:0x47349477
    Program: MOUNT (100005)
Mount Service
    [Program Version: 3]
    [V3 Procedure: MNT (1)]
    Path: /gv0/fleming3/db0/ALTUS_data
»Remote Procedure Call, Type:Reply XID:0x47349477
    Reply State: accepted (0)
Mount Service
    [Program Version: 3]
    [V3 Procedure: MNT (1)]
    Status: OK (0)
    fhandle
        length: 34
        [hash (CRC-32): 0x10650fe6]
        [Name: 192.168.10.1:/gv0/fleming3/db0/ALTUS_data]
        filehandle: 3a4f20117b487f884f169490a0349afacf71331965f573144e93b289a395148edfe5
»Remote Procedure Call, Type:Call XID:0x47349478
    Program: NFS (100003)
    Program Version: 3
    Procedure: FSINFO (19)
Network File System, FSINFO Call DH:0x10650fe6
    [Program Version: 3]
    [V3 Procedure: FSINFO (19)]
    object
        length: 34
        [hash (CRC-32): 0x10650fe6]
        [Name: 192.168.10.1:/gv0/fleming3/db0/ALTUS_data]
        filehandle: 3a4f20117b487f884f169490a0349afacf71331965f573144e93b289a395148edfe5
»Remote Procedure Call, Type:Reply XID:0x47349478
    Reply State: accepted (0)
    Accept State: procedure can't decode params (4)

ARGH. Not sure what's going on here - wireshark is perfectly happy to decode those params.

If I do a regular filesystem mount from Linux, the result is:

»Remote Procedure Call, Type:Call XID:0x266eda62
    Program: MOUNT (100005)
Mount Service
    [Program Version: 3]
    [V3 Procedure: MNT (1)]
    Path: /gv0/fleming1/db0/ALTUS_data
»Remote Procedure Call, Type:Reply XID:0x266eda62
    Reply State: accepted (0)
Mount Service
    [Program Version: 3]
    [V3 Procedure: MNT (1)]
    Status: OK (0)
    fhandle
        length: 34
        [hash (CRC-32): 0xb2ae682f]
        [Name: 192.168.10.1:/gv0/fleming1/db0/ALTUS_data]
        filehandle: 3a4f20117b487f884f169490a0349afacf71e8bd0e2198c34cb88a0231dbeb071035
»Remote Procedure Call, Type:Call XID:0x68b3c756
    Program: NFS (100003)
    Procedure: FSINFO (19)
Network File System, FSINFO Call DH:0xb2ae682f
    [Program Version: 3]
    [V3 Procedure: FSINFO (19)]
    object
        length: 34
        [hash (CRC-32): 0xb2ae682f]
        [Name: 192.168.10.1:/gv0/fleming1/db0/ALTUS_data]
        filehandle: 3a4f20117b487f884f169490a0349afacf71e8bd0e2198c34cb88a0231dbeb071035
»Remote Procedure Call, Type:Reply XID:0x68b3c756
    Reply State: accepted (0)
Network File System, FSINFO Reply
    [Program Version: 3]
    [V3 Procedure: FSINFO (19)]
    Status: NFS3_OK (0)
    obj_attributes  Directory mode:0755 uid:500 gid:1000
    rtmax: 65536
    rtpref: 65536
    rtmult: 4096
    wtmax: 65536
    wtpref: 65536
    wtmult: 4096
    dtpref: 65536
    maxfilesize: 1125899906842624
    time delta: 1.000000000 seconds
    Properties: 0x0000001b

So for some reason, gluster is happy with Linux's request but not Oracle's.

All I get out of gluster is:
[2013-04-08 12:54:32.206312] E [nfs3.c:4741:nfs3svc_fsinfo] 0-nfs-nfsv3: Error decoding arguments

I've attached abridged packet captures and text explanations of the packets (thanks to wireshark).

Can someone please look at this and determine whether gluster's parsing of the RPC call is to blame, or Oracle's encoding of it?

This is the same setup on which I reported the NFS race condition bug. It does have that patch applied.

Details: http://lists.gnu.org/archive/html/gluster-devel/2013-04/msg00014.html

Comment 1 Michael Brown 2013-04-09 17:01:33 UTC
Created attachment 733286 [details]
Good FSINFO RPC from Linux

Comment 2 Michael Brown 2013-04-09 17:02:17 UTC
Created attachment 733287 [details]
Text summary of failed FSINFO RPC

Comment 3 Michael Brown 2013-04-09 17:02:39 UTC
Created attachment 733288 [details]
Text summary of successful FSINFO RPC

Comment 4 Michael Brown 2013-04-11 16:40:30 UTC
Niels de Vos <ndevos> points out in http://lists.gnu.org/archive/html/gluster-devel/2013-04/msg00050.html:

«
XDR (http://tools.ietf.org/html/rfc4506, the encoding used for the RPC 
protocol) uses 'blocks' for alignment. A fhandle byte array that is 
34-bytes long, needs to be (34 / 4 + 1)*4 = 36 bytes in size. The 'length' 
given in the structure tells the consumer to ignore the two trailing
bytes.

The NFSv3 specification (http://tools.ietf.org/html/rfc1813#page-21) 
defines the nfs_fh3 as an opaque (not bytes) structure.

My guess is that this (untested) change would fix it, can you try that?
»

It didn't :) Looks like Niels may have identified the problem; it still needs to be fixed, however.

Comment 5 Niels de Vos 2013-04-12 15:35:20 UTC
New proposal sent to Michael with gluster-devel@ on CC:

bool_t
xdr_nfs_fh3 (XDR *xdrs, nfs_fh3 *objp)
{
        uint32_t size;

        if (!xdr_int (xdrs, (int *)&size))
                return FALSE;
        if (!xdr_opaque (xdrs, objp->data.data_val, size))
                return FALSE;
        return TRUE;
}

Comment 6 Niels de Vos 2013-04-12 22:53:48 UTC
Created attachment 735043 [details]
Proposed patch for testing

23:51 < ndevos> Supermathie: ah, I've thought of the error in my
   suggestion - that function is used to encode and decode
23:52 < ndevos> which means, that the size parameter must be set
   correctly - the .data_len attribute contain the size when encoding,
   and should be overwritten when decoding
23:53 < ndevos> KERBOOM happens when an idea is only half looked at :-/

Maybe the attached patch works better? It should encode/decode
both the length and the fhandle value. Compile-tested only.

Comment 7 Niels de Vos 2013-04-13 13:11:14 UTC
Created attachment 735301 [details]
Updated patch

This patch does not break the Linux NFS client. I wonder if it makes it
possible to use the Oracle DNFS client.

Comment 8 Michael Brown 2013-04-30 16:15:48 UTC
When gluster accepts the bad RPC in the FSINFO handler, things continue on, but the same badly encoded XDR keeps coming in and eventually causes the glusterfs NFS daemon to crash.

Test cases need to be added, and gluster needs to be more robust in handling this situation.

Regarding Oracle, I'm able to work around the problem by expanding the size of the FD so that it happens to be congruent to 0 mod 4 bytes: https://github.com/Supermathie/glusterfs/commit/95880cf71375cb4b04a1b645598c7570c5087de7

I'm morally opposed to submitting this for inclusion in Gluster, however - Oracle needs to fix their code!

I'm inclined to leave this bug open as a request for better robustness in handling bad XDR encoding in incoming RPCs - they shouldn't be crashing Gluster's NFS server.

Comment 9 Anand Avati 2013-04-30 19:16:28 UTC
REVIEW: http://review.gluster.org/4918 (Expand gluster's NFS FD header to 4 bytes) posted (#2) for review on master by Anand Avati (avati)

Comment 10 Anand Avati 2013-05-01 10:12:34 UTC
COMMIT: http://review.gluster.org/4918 committed in master by Anand Avati (avati) 
------
commit 39a1eaf38d64f66dfa74c6843dc9266f40dd4645
Author: Michael Brown <michael>
Date:   Tue Apr 30 11:34:57 2013 -0400

    Expand gluster's NFS FD header to 4 bytes
    
    * https://bugzilla.redhat.com/show_bug.cgi?id=950121
    * Oracle's DNFS does not properly XDR encoding on NFS FDs that
      are not congruent to 0mod4 bytes long
    * This patch is a workaround to support Oracle's buggy code
    
    Change-Id: Ic621e2cd679a86aa9a06ed9ca684925e1e0ec43f
    BUG: 950121
    Signed-off-by: Michael Brown <michael>
    Reviewed-on: http://review.gluster.org/4918
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Anand Avati <avati>

Comment 11 Niels de Vos 2014-04-17 11:41:35 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.5.0, please reopen this bug report.

glusterfs-3.5.0 has been announced on the Gluster Developers mailing list [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/6137
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

