Bug 851312

Summary: pNFS client fails to select correct DS from multipath
Product: Red Hat Enterprise Linux 6
Reporter: Tigran Mkrtchyan <tigran.mkrtchyan>
Component: kernel
Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA
QA Contact: Jian Li <jiali>
Severity: medium
Priority: unspecified
Version: 6.2
CC: andros, jiali, nmurray, rwheeler, steved
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: pNFS
Fixed In Version: kernel-2.6.32-335.el6
Doc Type: Bug Fix
Last Closed: 2013-02-21 06:48:13 UTC
Type: Bug
Bug Blocks: 910083
Attachments:
possible fix (attachment 606683, flags: none)
Patch the fixes the problem (attachment 621082, flags: none)

Description Tigran Mkrtchyan 2012-08-23 18:29:54 UTC
Created attachment 606683 [details]
possible fix

Description of problem:

The pNFS client implementation in RHEL 6.2 does not support multipath for data servers (DS) and simply picks the first address from the list. The client also does not support IPv6 addresses. As a result, if a server replies with a multipath address list in which the first entry is IPv6 and the second is IPv4, the client picks the first one and fails. It would be better if the client used the first supported address.

Version-Release number of selected component (if applicable):
RHEL 6.2, kernel-2.6.32-220.23.1.el6.x86_64

How reproducible:
You will need a pNFS server which supports IPv6 and IPv4 as well as multipath.

Steps to Reproduce:
1. The server has to return both an IPv6 and an IPv4 address, with the IPv6 entry first.
  
Actual results:
I/O error with the following error messages:
decode_device: Multipath count 2 not supported, skipping all greater than 1
decode_and_add_device: Could not decode or add device


Expected results:
Client picks the first supported address

Additional info:

A possible fix is attached. Please note that this is just a hint and has not been tested.
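
The expected behaviour described above ("pick the first supported address") can be sketched in user-space C roughly as follows. This is only an illustration, not the attached patch and not the kernel code: the type struct ds_netaddr, the functions netid_supported() and pick_first_supported(), and the have_ipv6 flag are all hypothetical names, and the set of accepted netids ("tcp", "tcp6") is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of a data server's multipath list (hypothetical type). */
struct ds_netaddr {
    const char *netid;  /* e.g. "tcp" or "tcp6" */
    const char *uaddr;  /* universal address, e.g. "127.0.0.1.48.57" */
};

/* Return 1 if this client can use the given netid, else 0.
 * Assumption: only "tcp" and (optionally) "tcp6" are considered. */
static int netid_supported(const char *netid, int have_ipv6)
{
    if (strcmp(netid, "tcp") == 0)
        return 1;
    if (strcmp(netid, "tcp6") == 0)
        return have_ipv6;
    return 0;
}

/* Walk the multipath list in server order and return the index of the
 * first entry the client can use, or -1 if none is usable. */
int pick_first_supported(const struct ds_netaddr *list, size_t count,
                         int have_ipv6)
{
    assert(list != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (netid_supported(list[i].netid, have_ipv6))
            return (int)i;
    }
    return -1;
}
```

With a list of { "tcp6", "tcp" } and have_ipv6 set to 0, the sketch skips the first entry and returns index 1, instead of failing on the IPv6 entry.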

Comment 2 Ric Wheeler 2012-08-24 11:48:16 UTC
pNFS code in RHEL lags the upstream (but will get significant updates).

Have you seen this same issue with upstream (try Fedora 17 for example)?

Comment 3 Tigran Mkrtchyan 2012-08-24 12:27:50 UTC
No. Fedora 17 supports IPv6 as well as multipath data servers.

Comment 4 RHEL Program Management 2012-10-02 18:11:29 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 5 Ric Wheeler 2012-10-03 05:42:33 UTC
Hi Tigran, it would be good to have you try out the updated pNFS code that is coming with RHEL6.4. We have a significant update in that release.

Comment 6 Tigran Mkrtchyan 2012-10-03 13:29:37 UTC
Hi Ric.

Sure, just give me a pointer to the rpm or srpm.

Comment 7 Steve Dickson 2012-10-03 19:12:28 UTC
Created attachment 621082 [details]
Patch the fixes the problem

This patch was tested at this fall's BAT...

Comment 8 Tigran Mkrtchyan 2012-10-03 20:08:31 UTC
I can confirm that the patch fixes the problem I see.

Comment 10 Jian Li 2012-10-08 01:34:53 UTC
Will try to test this bug with newpynfs (4.1); setting qa_ack+.

Comment 11 Jarod Wilson 2012-10-19 20:51:27 UTC
Patch(es) available on kernel-2.6.32-335.el6

Comment 14 Jian Li 2013-01-30 07:29:15 UTC
In 2.6.32-356, this patch still cannot be tested, because the kernel still supports only IPv4.

fs/nfs/nfs4filelayoutdev.c:

    /* Currently only support ipv4 address */
    if (in4_pton(buf, rlen, (u8 *)&ip_addr, '-', &ipend) == 0) {
        dprintk("%s: Only ipv4 addresses supported\n", __func__);
        goto out_free;
    }


SanityOnly test. Steve, please check: should we adapt the kernel?
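
The check quoted above uses the kernel helper in4_pton(), which accepts only IPv4 literals. As a user-space illustration only (it is not the kernel patch from this bug), the same kind of check can be extended to both address families with the POSIX function inet_pton(). The function name classify_ip_literal() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Classify a textual IP literal.
 * Returns AF_INET for IPv4, AF_INET6 for IPv6, or -1 if it is neither. */
int classify_ip_literal(const char *text)
{
    struct in_addr v4;
    struct in6_addr v6;

    assert(text != NULL);

    /* inet_pton() returns 1 only when the text is a valid literal
     * for the requested family. */
    if (inet_pton(AF_INET, text, &v4) == 1)
        return AF_INET;
    if (inet_pton(AF_INET6, text, &v6) == 1)
        return AF_INET6;
    return -1;
}
```

A caller that handles only IPv4 could then skip entries classified as AF_INET6 rather than treating them as a decode error.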

Comment 15 Ric Wheeler 2013-02-01 12:26:50 UTC
Do we need a separate BZ to track the IPV4 only issue?

Comment 16 Jian Li 2013-02-04 03:14:53 UTC
(In reply to comment #14)
> In 2.6.32-356, this patch still would not be tested, kernel still only
> support ipv4.
> 

This patch can be tested with newpynfs: the server offers a multipath deviceinfo in which the first address is IPv6 ([::1]) and the second is IPv4 (127.0.0.1).

INFO   :nfs.server:****************************************
INFO   :nfs.server:Handling COMPOUND
INFO   :nfs.server:COMPOUND4args(tag='', minorversion=1, argarray=[nfs_argop4(argop=OP_SEQUENCE, opsequence=SEQUENCE4args(sa_sessionid='0000000000000001', sa_sequenceid=24, sa_slotid=0, sa_highest_slotid=0, sa_cachethis=False)), nfs_argop4(argop=OP_GETDEVICEINFO, opgetdeviceinfo=GETDEVICEINFO4args(gdia_device_id='\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', gdia_layout_type=LAYOUT4_NFSV4_1_FILES, gdia_maxcount=16384, gdia_notify_types=0L))])
INFO   :nfs.server:*** OP_SEQUENCE (53) ***
INFO   :nfs.server:*** OP_GETDEVICEINFO (47) ***
INFO   :nfs.server:Replying.  Status NFS4_OK (0)
INFO   :nfs.server:COMPOUND4res(status=NFS4_OK, tag='', resarray=[nfs_resop4(resop=OP_SEQUENCE, opsequence=SEQUENCE4res(sr_status=NFS4_OK, sr_resok4=SEQUENCE4resok(sr_sessionid='0000000000000001', sr_sequenceid=24, sr_slotid=0, sr_highest_slotid=0, sr_target_highest_slotid=8, sr_status_flags=0))), nfs_resop4(resop=OP_GETDEVICEINFO, opgetdeviceinfo=GETDEVICEINFO4res(gdir_status=NFS4_OK, gdir_resok4=GETDEVICEINFO4resok(gdir_device_addr=device_addr4(da_layout_type=LAYOUT4_NFSV4_1_FILES, da_addr_body='\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x04tcp6\x00\x00\x00\t::1.48.58\x00\x00\x00\x00\x00\x00\x03tcp\x00\x00\x00\x00\x0f127.0.0.1.48.57\x00'), gdir_notification=0)))])


The client mounts with:
[root@hp-xw4600-01 ~]# mount [::1]:/files /mnt/test -o vers=4,minorversion=1
[root@hp-xw4600-01 ~]# grep "/mnt/test" /proc/mounts 
[::1]:/files/ /mnt/test nfs4 rw,relatime,vers=4,rsize=4096,wsize=4096,namlen=255,hard,proto=tcp6,port=0,timeo=600,retrans=2,sec=sys,clientaddr=::1,minorversion=1,local_lock=none,addr=::1 0 0
[root@hp-xw4600-01 ~]# cat /proc/fs/nfsfs/servers 
NV SERVER   PORT USE HOSTNAME
v4 7f000001 3039   1 (null)
v4 00000000000000000000000000000001  801   1 ::1
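
To relate the da_addr_body above to the servers table: the two entries are universal addresses of the form host.p1.p2, where the port is p1 * 256 + p2. "127.0.0.1.48.57" therefore means port 12345 (0x3039), which matches the 7f000001 3039 row, and "::1.48.58" means port 12346. The following is a minimal user-space sketch of that split; uaddr_split() and parse_octet() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Parse the decimal digits in [s, stop) as a value in 0..255. */
static int parse_octet(const char *s, const char *stop, unsigned *out)
{
    unsigned v = 0;

    if (s == stop)
        return -1;
    for (; s < stop; s++) {
        if (*s < '0' || *s > '9')
            return -1;
        v = v * 10 + (unsigned)(*s - '0');
        if (v > 255)
            return -1;
    }
    *out = v;
    return 0;
}

/* Split a universal address "host.p1.p2" into the host text and the
 * port (p1 * 256 + p2). Returns 0 on success, -1 on malformed input. */
int uaddr_split(const char *uaddr, char *host, size_t hostlen,
                unsigned *port)
{
    const char *d1, *d2;
    unsigned p1, p2;
    size_t n;

    assert(uaddr != NULL && host != NULL && port != NULL);

    d2 = strrchr(uaddr, '.');           /* dot in front of p2 */
    if (d2 == NULL || d2 == uaddr)
        return -1;
    d1 = d2 - 1;                        /* search back for dot before p1 */
    while (d1 > uaddr && *d1 != '.')
        d1--;
    if (*d1 != '.' || d1 == uaddr)
        return -1;

    if (parse_octet(d1 + 1, d2, &p1) != 0 ||
        parse_octet(d2 + 1, d2 + 1 + strlen(d2 + 1), &p2) != 0)
        return -1;

    n = (size_t)(d1 - uaddr);           /* length of the host part */
    if (n >= hostlen)
        return -1;
    memcpy(host, uaddr, n);
    host[n] = '\0';
    *port = p1 * 256 + p2;
    return 0;
}
```

Calling uaddr_split("127.0.0.1.48.57", ...) yields host "127.0.0.1" and port 12345, consistent with the client having connected to the second (IPv4) entry of the multipath list.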

Comment 17 Jian Li 2013-02-04 03:24:12 UTC
(In reply to comment #15)
> Do we need a separate BZ to track the IPV4 only issue?

This patch aims to deal with the IPv4-only issue, so this bug is finished.

Comment 19 errata-xmlrpc 2013-02-21 06:48:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0496.html