Bug 143897

Summary: [EMC/Symantec RHEL 4.5 FEAT] NFS can't handle minor number greater than 255
Product: Red Hat Enterprise Linux 4 Reporter: Neena Bhatnagar <neena>
Component: kernel    Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: high    
Version: 4.0    CC: agud, bmarson, jbaron, kearnan_keith, linux26port, mmahut, ram_pandiri, sprabhu, steve.overy, tao, urs
Target Milestone: ---    Keywords: FutureFeature
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0304 Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-05-01 22:44:30 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 176344, 195232, 196060, 198694, 198868, 200231, 228868    
Attachments:
Description Flags
Proposed RHEL4 Patch
none
debug patch 1
none
ethereal tracedumps
none
fix 1
none

Description Neena Bhatnagar 2004-12-31 20:04:14 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; rv:1.7.3)
Gecko/20041001 Firefox/0.10.1

Description of problem:

 All SCSI drivers are currently limited to using minor numbers up
 to 255.

 The reason is that NFS still cannot handle the larger minor
 numbers. Once a volume with a smaller minor number is created, it
 works fine.
 Please let us know if you need any more information.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
 1. Create a volume.
 2. Create a filesystem on it and mount it.
 3. Export the filesystem.
 4. On another machine, try to mount that filesystem. The mount
    command will hang ...
    

Additional info:

Comment 3 Steve Dickson 2005-11-21 11:22:16 UTC
How big does the volume have to be for this problem to occur?

Comment 4 kiran mehta 2006-02-02 17:07:25 UTC
Even a 100MB volume with a minor number > 255 does not get mounted on the client.

Comment 6 Bryan Mason 2006-02-06 21:41:49 UTC
To perhaps clarify, what is being reported as an issue is that if an NFS server
exports a filesystem that resides on a device that has a minor number > 255,
then the mount command on the client will hang.  It doesn't make a difference
how large the volume is, as long as the device's minor number is greater than 255.

Comment 8 Steve Dickson 2006-05-04 10:49:33 UTC
*** Bug 177517 has been marked as a duplicate of this bug. ***

Comment 12 Andrius Benokraitis 2006-05-24 15:43:09 UTC
*** Bug 192874 has been marked as a duplicate of this bug. ***

Comment 16 Steve Dickson 2006-06-15 12:05:58 UTC
How exactly are you getting a minor number > 255? Is it the case that
I need to create 256 disk partitions?


Comment 17 kiran mehta 2006-06-15 13:26:45 UTC
Veritas Volume Manager allows creation of volumes with minor numbers greater
than 255. For raw disks, I guess the only way is to create that many
partitions, as you suggested.


Comment 19 Andrius Benokraitis 2006-06-16 15:29:32 UTC
Installing EMC PowerPath will also generate > 255 devices.

Comment 20 Keiichi Mori 2006-06-19 13:04:56 UTC
Here's the patch in upstream...
http://sourceforge.net/mailarchive/message.php?msg_id=17219025
(Subject: Problems with fh_fsid_type=3 exports (device minor id > 255))


Comment 22 Steve Dickson 2006-06-19 13:49:35 UTC
FYI...

Installing EMCpower.LINUX-4.3.0 on a 2.6.9-36.ELsmp kernel
I got the following errors... 

rpm -ihv EMCpower.LINUX-4.3.0-166.rpm
Preparing...                ########################################### [100%]
All trademarks used herein are the property of their respective owners.
   1:EMCpower.LINUX         ########################################### [100%]
This linux kernel configuration is not supported by PowerPath..
error: %post(EMCpower.LINUX-4.3.0-166.i386) scriptlet failed, exit status 1

which caused things not to be installed or set up correctly...


I'll continue to pursue the Veritas way... 


Comment 23 Keith Kearnan 2006-06-19 13:54:02 UTC
Hi Steve,
PowerPath 4.3.0 is meant for the RHEL3 (2.4 kernel).  PowerPath 4.5.1 will 
install on RHEL4u2 & RHEL4u3.
-Keith

Comment 24 Steve Dickson 2006-06-19 14:04:53 UTC
Created attachment 131134 [details]
Proposed RHEL4 Patch

OK, I'm game... Here is the RHEL4 patch that incorporates the upstream
maintainer's suggestion... Unfortunately I'm having setup issues, so could
you please test the attached patch? If it is successful, I will propose
it ASAP... If it would be better for us to supply a kernel RPM
containing the patch, please let us know...

Comment 25 Keith Kearnan 2006-06-19 14:12:57 UTC
Sure, will do.

Comment 26 Andrius Benokraitis 2006-06-19 15:09:30 UTC
FYI... a guide for PowerPath versions mapping to RHEL:

4.3.4 --> RHEL 3 U7
4.3.3/4.3.4 --> RHEL 3 U6
4.3.3/4.3.4 --> RHEL 3 U5
4.3.2/4.3.3 --> RHEL 3 U4 

4.5.1 --> RHEL 4 U3
4.5.0 --> RHEL 4 U2
4.4.1 --> RHEL 4 U1

Comment 27 Steve Dickson 2006-06-20 00:45:55 UTC
Test kernels that contain the patch from comment #24
are available at http://people.redhat.com/steved/bz143897/

Comment 31 Steve Dickson 2006-06-24 13:11:46 UTC
bummer....

Looking at the tcpdumps in one of the numerous ITs, it appears
the FSINFO request never gets a reply... To verify this, please enable some
nfsd debugging by running
    ' echo > /var/log/messages; echo 18 >  /proc/sys/sunrpc/nfsd_debug'
and then post what ends up in /var/log/messages...
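
For reference, 18 decimal selects two of the nfsd debug flags. A sketch of the
flag constants, assuming the mainline include/linux/nfsd/debug.h values (not
verified against the RHEL4 kernel source):

    /* Flags accepted by /proc/sys/sunrpc/nfsd_debug (mainline values,
     * assumed unchanged in RHEL4). */
    #define NFSDDBG_SOCK      0x0001
    #define NFSDDBG_FH        0x0002  /* file handle handling (fh_verify/fh_compose) */
    #define NFSDDBG_EXPORT    0x0004
    #define NFSDDBG_SVC       0x0008
    #define NFSDDBG_PROC      0x0010  /* per-procedure dispatch (nfsd_dispatch) */
    #define NFSDDBG_FILEOP    0x0020
    #define NFSDDBG_AUTH      0x0040
    #define NFSDDBG_REPCACHE  0x0080
    #define NFSDDBG_XDR       0x0100
    #define NFSDDBG_LOCKD     0x0200

    /* 18 == 0x12 == NFSDDBG_FH | NFSDDBG_PROC, i.e. log file handle
     * verification and per-procedure dispatch. */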

Comment 35 Keith Kearnan 2006-07-05 19:16:50 UTC
Hi Steve,
I tested the kernel (2.6.9-39.1.bz143897.EL) and I still get a hang.

[root@L7 ~]# uname -a
Linux L7 2.6.9-39.1.bz143897.EL #1 Mon Jun 19 16:04:16 EDT 2006 x86_64 x86_64 
x86_64 GNU/Linux
[root@L7 ~]# echo > /var/log/messages; echo 18 >  /proc/sys/sunrpc/nfsd_debug

[root@L7 ~]# tail -100 /var/log/messages

Jul  5 11:06:04 L7 kernel: NFSD: laundromat service - starting, examining 
clients
Jul  5 11:06:04 L7 kernel: NFSD: laundromat_main - sleeping for 90 seconds
Jul  5 11:06:40 L7 sshd(pam_unix)[29231]: session opened for user root by root
(uid=0)
Jul  5 11:07:34 L7 kernel: NFSD: laundromat service - starting, examining 
clients
Jul  5 11:07:34 L7 kernel: NFSD: laundromat_main - sleeping for 90 seconds
Jul  5 11:09:04 L7 kernel: NFSD: laundromat service - starting, examining 
clients
Jul  5 11:09:04 L7 kernel: NFSD: laundromat_main - sleeping for 90 seconds

-Keith

Comment 36 Keith Kearnan 2006-07-06 12:28:33 UTC
Please ignore my previous post.  I forgot that the patch needs to be on the
server.  I need to redo the test.  Sorry.

Comment 37 Barry Marson 2006-07-07 17:17:17 UTC
Installed the RHEL4-U4 6/7 tree (i386) and Storage Foundation 4.1 on the
veritas2.rhts.boston.redhat.com test system.  Created two disk groups, each with
one volume.  One of the volumes has a minor # of 128, the other 512.  Created
vxfs filesystems on both and mounted them.

Commands on server:

  # vxdg init lowMinor_dg sdc minor=128
  # vxassist make -g lowMinor_dg lowMinor 1g
  # mkfs -t vxfs /dev/vx/dsk/lowMinor_dg/lowMinor
  # mount -t vxfs /dev/vx/dsk/lowMinor_dg/lowMinor /lowLUNfs

  # vxdg init hiMinor_dg sdd minor=512
  # vxassist make -g hiMinor_dg hiMinor 2g
  # mkfs -t vxfs /dev/vx/dsk/hiMinor_dg/hiMinor
  # mount -t vxfs /dev/vx/dsk/hiMinor_dg/hiMinor /hiLUNfs

From client:

  # mount -t nfs veritas2:/lowLUNfs /lo
  # mount -t nfs veritas2:/hiLUNfs /hi

The low minor # filesystem worked fine; the high minor # filesystem hung when
attempting to mount.  The messages file shows that both filesystem mount
requests were authenticated:


Jul  7 12:19:48 veritas2 kernel: nfsd: last server has exited
Jul  7 12:19:48 veritas2 kernel: nfsd: unexporting all filesystems
Jul  7 12:19:48 veritas2 nfs: nfsd shutdown succeeded
Jul  7 12:19:48 veritas2 nfs: rpc.rquotad shutdown succeeded
Jul  7 12:19:48 veritas2 nfs: Shutting down NFS services:  succeeded
Jul  7 12:19:48 veritas2 nfs: Starting NFS services:  succeeded
Jul  7 12:19:48 veritas2 nfs: rpc.rquotad startup succeeded
Jul  7 12:19:48 veritas2 nfsd[16798]: nfssvc_versbits: +2 +3 +4
Jul  7 12:19:48 veritas2 nfs: rpc.nfsd startup succeeded
Jul  7 12:19:48 veritas2 nfs: rpc.mountd startup succeeded
Jul  7 12:19:48 veritas2 rpcidmapd: rpc.idmapd -SIGHUP succeeded
Jul  7 12:19:59 veritas2 mountd[16810]: authenticated mount request from
ubrew.boston.redhat.com:926 for /lowLUNfs (/lowLUNfs)
Jul  7 12:20:07 veritas2 mountd[16810]: authenticated mount request from
ubrew.boston.redhat.com:930 for /hiLUNfs (/hiLUNfs)



Comment 38 Amit Gud 2006-07-09 13:30:51 UTC
The server definitely doesn't seem to be able to send out any reply to the FSINFO
call, or to anything after it.

This is for a volume with minor number 128

[root@veritas2 logs]# grep nfsd_dispatch messages-good7
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 19
nfsd_dispatch: vers 3 proc 1

[root@veritas2 logs]# grep sendto messages-good7
svc: socket f6b8fc80 sendto([c40b8000 28... ], 28) = 28 (addr 385310ac)
svc: socket f6b8fc80 sendto([d7647000 28... ], 28) = 28 (addr 385310ac)
svc: socket f6b8fc80 sendto([c3057000 28... ], 28) = 28 (addr 385310ac)
svc: socket f6b8fc80 sendto([d00ab000 84... ], 84) = 84 (addr 385310ac)
svc: socket f6b8fc80 sendto([c3057000 116... ], 116) = 116 (addr 385310ac)

Whereas for minor number 512:

[root@veritas2 logs]# grep nfsd_dispatch messages-bad8
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 19
nfsd_dispatch: vers 3 proc 19
nfsd_dispatch: vers 3 proc 1
nfsd_dispatch: vers 3 proc 1

[root@veritas2 logs]# grep sendto messages-bad8
svc: socket f7131f00 sendto([c2e8a000 28... ], 28) = 28 (addr 385310ac)
svc: socket f7131f00 sendto([c2e8a000 28... ], 28) = 28 (addr 385310ac)
svc: socket f7131f00 sendto([c2e8a000 28... ], 28) = 28 (addr 385310ac)

I suspect the problem is in the cache flags, and that's the reason the upcall is not
made from cache_check().

-------
for reference:

proc 0 - NULL / ping
proc 19 - FSINFO
proc 1 - GETATTR


Comment 39 Amit Gud 2006-07-11 17:30:16 UTC
Created attachment 132256 [details]
debug patch 1

Apparently the upcall is being sent, but the response is never received.

With the attached patch, here is the output for minor 512 (the output for minor
126 is similar, except fsid_type is 0 instead of 3):

---
nfsd_dispatch: vers 3 proc 19
nfsd: FSINFO(3)   12: 00030001 0020c700 00000002 00000000 00000000 00000000
nfsd: fh_verify(12: 00030001 0020c700 00000002 00000000 00000000 00000000)
Jul 11 12:44:52 veritas2 kernel: RPC: svc_expkey_lookup: set=0, INPLACE=0, item
flags=3225511531
NFSD: svc_expkey_init: fsid_type=3, fsid[0]=2148096, fsid[1]=2,
fsid[2]=4125508608, client=f5387940
NFS: AG: exp_find_key: fsid_type=3, valid=0, negative=0, pending=0, flags=0
RPC: cache_check: flags=0 exp_time=1152636408 last_ref=1152636288
Want update, refage=120, age=0
NFSD: cache_make_upcall
NFSD: expkey_request: fsid_type=3, fsid[0]=2148096, fsid[1]=2,
fsid[2]=4125508608, client=f5387940
NFSD: � *bpp=3 type=<NULL>
NFSD: cache_make_upcall: len=4073
---

The response should come from svc_export_parse, i.e. when
/proc/net/rpc/nfsd.export is written to. But it never is. So the nfsd code
looks good so far; we need to find out why that file is never written.

Comment 40 Amit Gud 2006-07-12 00:42:26 UTC
Created attachment 132274 [details]
ethereal tracedumps

The network trace shows some interesting bits. In the mount reply, mountd
doesn't return a correct file handle (i.e., one of the same form as the file
handle returned for a minor number < 256).
Attached are tracedumps:
traces/trace-bad1-upcall  - minor number 512
traces/trace-good-upcall  - minor number 128

Comment 41 Amit Gud 2006-07-12 16:40:06 UTC
Some endianness or bitwise-operation issue, I guess. This is the reply by
mountd to the MNT call:

For minor 126 -

01 00 00 00   00 c7   00 80   02 00 00 00
-----------   -----   -----   -----------
handle type   major   minor     inode#
(knfsd new)   (199)   (126)      (2)

where as for minor 512 - 

01 00 03 00   00 c7   20 00   02 00 00 00
-----------   -----   -----   -----------
handle type   major   minor     inode#
   (?)        (199)    (?)       (2)


The file handle type should be the same - 01 00 00 00 -
and the minor number should be 512 - 02 00.

Some data is being mishandled, either in mountd or in fh_compose.


Comment 42 Amit Gud 2006-07-12 17:38:13 UTC
Corrected interpretation of the reply message (only relevant fields shown):

01 00 00 00   00 c7   00 80   02 00 00 00
--    --      -----   -----   -----------
A     B       major   minor     inode#
              (199)   (126)      (2)

A: version = 1
B: fsid_type = 0, which says the next 8 bytes are the major number (2
bytes), the minor number (2 bytes), and the inode number (4 bytes).

whereas for minor 512 - 

01 00 03 00   00 c7   20 00   02 00 00 00
--    --      -------------   -----------
A     B       device number     inode#
                                 (2)

A: version = 1
B: fsid_type = 3, which says the next 8 bytes are encoded as a 4-byte
device number and a 4-byte inode number.

Now, the device number is encoded using new_encode_dev, which turns out to be
2148096 decimal, i.e. 20C700 hex. I don't know in what byte order it is
represented above.
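
For reference, the new-style packing referred to here is new_encode_dev(). A
minimal sketch, assuming the mainline include/linux/kdev_t.h definition (not
checked against the RHEL4 tree):

    /* include/linux/kdev_t.h (mainline; assumed identical in RHEL4) */
    static inline u32 new_encode_dev(dev_t dev)
    {
            unsigned major = MAJOR(dev);
            unsigned minor = MINOR(dev);
            return (minor & 0xff) | (major << 8) | ((minor & ~0xff) << 12);
    }

    /* For major 199, minor 512:
     *   (512 & 0xff)          = 0x000000
     *   (199 << 8)            = 0x00C700
     *   ((512 & ~0xff) << 12) = 0x200000
     *   sum                   = 0x20C700 = 2148096 decimal
     * which matches the value above.  If the i386 server writes the fsid
     * words in its native little-endian order, 0x0020C700 appears on the
     * wire as the bytes "00 c7 20 00" seen in the reply. */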


Comment 43 Amit Gud 2006-07-12 20:08:53 UTC
Created attachment 132330 [details]
fix 1

This patch makes mounting possible and things work normally, but it is not a
fix, just a work-around.

The problem is that, for minor numbers greater than 255, the cache key is
generated with a different technique (new_encode_dev), and not everyone is
aware of that, it seems. I strongly suspect nfs-utils is the culprit and
doesn't handle the case well.

The attached patches are against nfs-utils-1.0.9 and 2.6.9-40.ELsmp.

Comment 44 Amit Gud 2006-07-14 13:45:54 UTC
Pity me. The fix in comment 24 is the right fix. I should've tested it
before rediscovering the fix the long way :| But anyway, a good learning
experience :).

I tested the machine with nfs-utils-1.0.9 on 2.6.9-40.ELsmp.


Comment 45 Andrius Benokraitis 2006-07-14 15:28:17 UTC
Amit, will this get propagated into RHEL 5? EMC tested RHEL 5 alpha 1 and they
stated they could see this issue in that version as well. This is a userspace
patch, and not a kernel patch, correct? Should this component be verified?

Comment 46 Amit Gud 2006-07-14 17:32:14 UTC
This is a kernel patch. I just now checked the kernel devel CVS branch
(2.6.18-rc1-git8), and it looks like it's fixed there. But I'm not sure if
this is the RHEL 5 alpha 1 kernel.


Comment 47 Andrius Benokraitis 2006-07-14 17:46:19 UTC
This is great news, thanks for the update! The RHEL 5 alpha was only 2.6.17-1,
so that explains it. This is looking good for RHEL 5, as the betas will be
based on later, 2.6.18-based kernels.

Comment 48 Keiichi Mori 2006-07-18 01:12:24 UTC
Please let me clarify this issue with respect to RHEL4.

The patch in comment #24 is the right fix for the kernel.
Also, we need to backport something from nfs-utils-1.0.9, right?



Comment 49 Steve Dickson 2006-07-20 11:53:02 UTC
No... The patch in Comment #24 is a kernel patch, not an nfs-utils patch... 

Comment 50 Amit Gud 2006-07-20 13:07:06 UTC
We don't need to backport anything from nfs-utils-1.0.9. Although I tested with
1.0.9, earlier versions should also work, as nfs-utils is not the cause of the bug.

Comment 61 Jason Baron 2006-09-15 19:16:28 UTC
Committed in stream U5 build 42.11. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 62 Andrius Benokraitis 2006-10-04 17:31:58 UTC
From Symantec:

Hi, I picked up this - " kernel-2.6.9-42.15.EL.i686.rpm" from
http://people.redhat.com/~jbaron/rhel4/

But the issue still remains :-

On server :-
=========

[root@vcslinux35 2.6.9-42.15.EL-i686]# uname -a
Linux vcslinux35.vxindia.veritas.com 2.6.9-42.15.EL #1 Fri Sep 29 18:08:58 EDT
2006 i686 i686 i386 GNU/Linux


[root@vcslinux35 2.6.9-42.15.EL-i686]# ls -l /dev/vx/dsk/fendg/vol
brw-------  1 root root 199, 65534 Oct  4 16:04 /dev/vx/dsk/fendg/vol


[root@vcslinux35 2.6.9-42.15.EL-i686]# mount
/dev/sda1 on / type ext3 (rw)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbfs on /proc/bus/usb type usbfs (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
tmpfs on /dev/vx type tmpfs (rw,size=4k,nr_inodes=2097152,mode=0755)
/dev/vx/dsk/fendg/vol on /mnt1 type ext3 (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)

On client :-
=========

[root@vcslinux119 ~]# showmount -e vcslinux35
Export list for vcslinux35:
/mnt1 *

[root@vcslinux119 ~]# mount vcslinux35:/mnt1 /mnt1

hangs forever...

Please let me know if you need any more info

Comment 65 Rita Sequeira 2006-11-17 10:58:20 UTC
Any update on this one, please? We are still waiting on a fix. If you need more
info, please let us know.

Rita

Comment 66 Andrius Benokraitis 2006-11-17 14:02:35 UTC
Rita, we are currently working on RHEL 5, and will start prioritizing 4.5 issues
soon. Many partners have requested this fix, but we will need your help in
testing when needed.

Comment 73 Rita Sequeira 2007-04-11 12:28:44 UTC
Fixed and tested on RHEL4 U5 snapshot 5. 

Comment 77 Red Hat Bugzilla 2007-05-01 22:44:30 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html


Comment 85 Jeff Layton 2007-07-20 10:51:14 UTC
*** Bug 174385 has been marked as a duplicate of this bug. ***

Comment 88 Sachin Prabhu 2009-09-07 13:13:52 UTC
A note for anyone else who stumbles on this BZ:

The corresponding nfs-utils patch was handled in bz 228868.
The fix requires nfs-utils-1.0.6-78 on the server.