Bug 143897
Summary: [EMC/Symantec RHEL 4.5 FEAT] NFS can't handle minor number greater than 255

| Field | Value | Field | Value |
| --- | --- | --- | --- |
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Neena Bhatnagar <neena> |
| Component: | kernel | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.0 | CC: | agud, bmarson, jbaron, kearnan_keith, linux26port, mmahut, ram_pandiri, sprabhu, steve.overy, tao, urs |
| Target Milestone: | --- | Keywords: | FutureFeature |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | RHBA-2007-0304 | Doc Type: | Enhancement |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2007-05-01 22:44:30 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 176344, 195232, 196060, 198694, 198868, 200231, 228868 | | |
| Attachments: | | | |
Description
Neena Bhatnagar
2004-12-31 20:04:14 UTC
How big does the volume have to be for this problem to occur?

Even a 100M volume with a minor number > 255 does not get mounted on the client. To clarify, the reported issue is that if an NFS server exports a filesystem that resides on a device with a minor number > 255, then the mount command on the client will hang. The size of the volume makes no difference, as long as the device's minor number is greater than 255.

*** Bug 177517 has been marked as a duplicate of this bug. ***

*** Bug 192874 has been marked as a duplicate of this bug. ***

How exactly are you getting a minor number > 255? Is it the case that I need to create 256 disk partitions?

Veritas Volume Manager allows creation of volumes with minor numbers greater than 255. For raw disks, I guess, the only way is to create that many partitions, as you suggested.

Installing EMC PowerPath will also generate > 255 devices.

Here's the patch in upstream... http://sourceforge.net/mailarchive/message.php?msg_id=17219025 (Subject: Problems with fh_fsid_type=3 exports (device minor id > 255))

FYI... Installing EMCpower.LINUX-4.3.0 on a 2.6.9-36.ELsmp kernel, I got the following errors:

rpm -ihv EMCpower.LINUX-4.3.0-166.rpm
Preparing... ########################################### [100%]
All trademarks used herein are the property of their respective owners.
1:EMCpower.LINUX ########################################### [100%]
This linux kernel configuration is not supported by PowerPath..
error: %post(EMCpower.LINUX-4.3.0-166.i386) scriptlet failed, exit status 1

which caused things not to be installed or set up correctly... I'll continue to pursue the Veritas way...

Hi Steve, PowerPath 4.3.0 is meant for RHEL 3 (the 2.4 kernel). PowerPath 4.5.1 will install on RHEL4u2 & RHEL4u3. -Keith

Created attachment 131134 [details]
Proposed RHEL4 Patch
Ok, I'm game... Here is the RHEL4 patch that incorporates the upstream
maintainer's suggestion... Unfortunately I'm having setup issues, so could
you please test the attached patch, and if it is successful, I will propose
it asap... If it would be better for us to supply a kernel RPM
containing the patch, please let us know...
Sure, will do.

FYI... a guide for PowerPath versions mapping to RHEL:
4.3.4 --> RHEL 3 U7
4.3.3/4.3.4 --> RHEL 3 U6
4.3.3/4.3.4 --> RHEL 3 U5
4.3.2/4.3.3 --> RHEL 3 U4
4.5.1 --> RHEL 4 U3
4.5.0 --> RHEL 4 U2
4.4.1 --> RHEL 4 U1

Test kernels that contain the patch from comment #24 are available in http://people.redhat.com/steved/bz143897/

Bummer.... Looking at the tcpdumps in one of the numerous ITs, it appears the FSINFO request is not returning... To verify this, please set some nfsd debugging by doing 'echo > /var/log/messages; echo 18 > /proc/sys/sunrpc/nfsd_debug' and then post what is in /var/log/messages...

Hi Steve, I tested the kernel (2.6.9-39.1.bz143897.EL) and I still get a hang.

[root@L7 ~]# uname -a
Linux L7 2.6.9-39.1.bz143897.EL #1 Mon Jun 19 16:04:16 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux
[root@L7 ~]# echo > /var/log/messages; echo 18 > /proc/sys/sunrpc/nfsd_debug
[root@L7 ~]# tail -100 /var/log/messages
Jul 5 11:06:04 L7 kernel: NFSD: laundromat service - starting, examining clients
Jul 5 11:06:04 L7 kernel: NFSD: laundromat_main - sleeping for 90 seconds
Jul 5 11:06:40 L7 sshd(pam_unix)[29231]: session opened for user root by root (uid=0)
Jul 5 11:07:34 L7 kernel: NFSD: laundromat service - starting, examining clients
Jul 5 11:07:34 L7 kernel: NFSD: laundromat_main - sleeping for 90 seconds
Jul 5 11:09:04 L7 kernel: NFSD: laundromat service - starting, examining clients
Jul 5 11:09:04 L7 kernel: NFSD: laundromat_main - sleeping for 90 seconds

-Keith

Please ignore my previous post. I forgot that the patch was in the server; I need to redo it. Sorry.

Installed the RHEL4-U4 6/7 tree (i386) and Storage Foundation 4.1 on the veritas2.rhts.boston.redhat.com test system. Created two disk groups, each with one volume; one of the volumes has a minor number of 128, the other 512. Created vxfs filesystems on both and mounted them.
Commands on the server:

# vxdg init lowMinor_dg sdc minor=128
# vxassist make -g lowMinor_dg lowMinor 1g
# mkfs -t vxfs /dev/vx/dsk/lowMinor_dg/lowMinor
# mount -t vxfs /dev/vx/dsk/lowMinor_dg/lowMinor /lowLUNfs
# vxdg init hiMinor_dg sdd minor=512
# vxassist make -g hiMinor_dg hiMinor 2g
# mkfs -t vxfs /dev/vx/dsk/hiMinor_dg/hiMinor
# mount -t vxfs /dev/vx/dsk/hiMinor_dg/hiMinor /hiLUNfs

From the client:

# mount -t nfs veritas2:/lowLUNfs /lo
# mount -t nfs veritas2:/hiLUNfs /hi

The low-minor filesystem worked fine; the high-minor filesystem hung when attempting to mount. The messages file states that both filesystem mount requests were authenticated:

Jul 7 12:19:48 veritas2 kernel: nfsd: last server has exited
Jul 7 12:19:48 veritas2 kernel: nfsd: unexporting all filesystems
Jul 7 12:19:48 veritas2 nfs: nfsd shutdown succeeded
Jul 7 12:19:48 veritas2 nfs: rpc.rquotad shutdown succeeded
Jul 7 12:19:48 veritas2 nfs: Shutting down NFS services: succeeded
Jul 7 12:19:48 veritas2 nfs: Starting NFS services: succeeded
Jul 7 12:19:48 veritas2 nfs: rpc.rquotad startup succeeded
Jul 7 12:19:48 veritas2 nfsd[16798]: nfssvc_versbits: +2 +3 +4
Jul 7 12:19:48 veritas2 nfs: rpc.nfsd startup succeeded
Jul 7 12:19:48 veritas2 nfs: rpc.mountd startup succeeded
Jul 7 12:19:48 veritas2 rpcidmapd: rpc.idmapd -SIGHUP succeeded
Jul 7 12:19:59 veritas2 mountd[16810]: authenticated mount request from ubrew.boston.redhat.com:926 for /lowLUNfs (/lowLUNfs)
Jul 7 12:20:07 veritas2 mountd[16810]: authenticated mount request from ubrew.boston.redhat.com:930 for /hiLUNfs (/hiLUNfs)

The server surely doesn't seem to be able to send out any reply to the FSINFO command, and onwards.
This is for a volume with minor number 128:

[root@veritas2 logs]# grep nfsd_dispatch messages-good7
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 19
nfsd_dispatch: vers 3 proc 1
[root@veritas2 logs]# grep sendto messages-good7
svc: socket f6b8fc80 sendto([c40b8000 28... ], 28) = 28 (addr 385310ac)
svc: socket f6b8fc80 sendto([d7647000 28... ], 28) = 28 (addr 385310ac)
svc: socket f6b8fc80 sendto([c3057000 28... ], 28) = 28 (addr 385310ac)
svc: socket f6b8fc80 sendto([d00ab000 84... ], 84) = 84 (addr 385310ac)
svc: socket f6b8fc80 sendto([c3057000 116... ], 116) = 116 (addr 385310ac)

Whereas for minor number 512:

[root@veritas2 logs]# grep nfsd_dispatch messages-bad8
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 0
nfsd_dispatch: vers 3 proc 19
nfsd_dispatch: vers 3 proc 19
nfsd_dispatch: vers 3 proc 1
nfsd_dispatch: vers 3 proc 1
[root@veritas2 logs]# grep sendto messages-bad8
svc: socket f7131f00 sendto([c2e8a000 28... ], 28) = 28 (addr 385310ac)
svc: socket f7131f00 sendto([c2e8a000 28... ], 28) = 28 (addr 385310ac)
svc: socket f7131f00 sendto([c2e8a000 28... ], 28) = 28 (addr 385310ac)

I suspect the problem is in the cache flags, and that's the reason the upcall is not made from cache_check().

For reference:
proc 0 - NULL / ping
proc 19 - FSINFO
proc 1 - GETATTR

Created attachment 132256 [details]
debug patch 1
Apparently the upcall is being sent but the response is never received.
With the attached patch, for minor 512 (and similarly for minor 126, where
fsid_type is 0 instead of 3):
---
nfsd_dispatch: vers 3 proc 19
nfsd: FSINFO(3) 12: 00030001 0020c700 00000002 00000000 00000000 00000000
nfsd: fh_verify(12: 00030001 0020c700 00000002 00000000 00000000 00000000)
Jul 11 12:44:52 veritas2 kernel: RPC: svc_expkey_lookup: set=0, INPLACE=0, item flags=3225511531
NFSD: svc_expkey_init: fsid_type=3, fsid[0]=2148096, fsid[1]=2, fsid[2]=4125508608, client=f5387940
NFS: AG: exp_find_key: fsid_type=3, valid=0, negative=0, pending=0, flags=0
RPC: cache_check: flags=0 exp_time=1152636408 last_ref=1152636288
Want update, refage=120, age=0
NFSD: cache_make_upcall
NFSD: expkey_request: fsid_type=3, fsid[0]=2148096, fsid[1]=2, fsid[2]=4125508608, client=f5387940
NFSD: � *bpp=3 type=<NULL>
NFSD: cache_make_upcall: len=4073
---
The response should come from svc_export_parse, i.e. when
/proc/net/rpc/nfsd.export is written to. But it is not. So the nfsd code
looks good so far; we need to find out why this file is never written to.
Created attachment 132274 [details]
ethereal tracedumps
Network trace shows some interesting bits. During the mount reply, mountd doesn't
return a correct file handle (or one similar to the file handle for minor
numbers < 256).
Attached are tracedumps:
traces/trace-bad1-upcall - minor number 512
traces/trace-good-upcall - minor number 128
Some endianness or boolean operation issue, I guess. This is the reply by mountd to the MNT call.

For minor 126:

01 00 00 00  00 c7  00 80  02 00 00 00
-----------  -----  -----  -----------
handle type  major  minor  inode#
(knfsd new)  (199)  (126)  (2)

whereas for minor 512:

01 00 03 00  00 c7  20 00  02 00 00 00
-----------  -----  -----  -----------
handle type  major  minor  inode#
(?)          (199)  (?)    (2)

The file handle type should be the same - 01 00 00 00 - and the minor number should be 512 - 02 00. Some data mishandling, either in mountd or in fh_compose.

Modifications in interpreting the reply message (only relevant fields shown):

01  00  00  00  00 c7  00 80  02 00 00 00
--      --      -----  -----  -----------
A       B      major  minor  inode#
               (199)  (126)  (2)

A: version = 1
B: fsid_type = 0, which says the next 8 bytes are the major number (2 bytes), the minor number (2 bytes) and the inode number (4 bytes)

whereas for minor 512:

01  00  03  00  00 c7 20 00    02 00 00 00
--      --      -------------  -----------
A       B      device number  inode#
                              (2)

A: version = 1
B: fsid_type = 3, which says the next 8 bytes are encoded as a 4-byte device number and a 4-byte inode number. Now, the device number is encoded using new_encode_dev, which turns out to be 2148096 decimal, i.e. 20C700 hex. I don't know in what byte order it is represented above.

Created attachment 132330 [details]
fix 1
This patch makes mounting possible and things work normally, but it is not a
fix, just a work-around.

The problem is that for minor numbers greater than 255, the cache key is
generated with a different technique (new_encode_dev), and not everyone is
aware of that, it seems. I strongly suspect nfs-utils is the culprit and
doesn't handle the case well.
Attached patches are against nfs-utils-1.0.9 and 2.6.9-40.ELsmp.
Pity me. The fix in comment #24 is the right fix. I should've tested it before rediscovering the fix all the way :| But anyway, a good learning experience :). I tested the machine with nfs-utils-1.0.9 on 2.6.9-40.ELsmp.

Amit, will this get propagated into RHEL 5? EMC tested RHEL 5 alpha 1 and stated they could see this issue in that version as well. This is a userspace patch, and not kernel, correct? Should this component be verified?

This is a kernel patch. I just checked the kernel devel CVS branch (2.6.18-rc1-git8), and it looks like it's fixed there. But I'm not sure if this is the RHEL 5 alpha 1 kernel.

This is great news, thanks for the update! The RHEL 5 alpha was only 2.6.17-1, so that explains it. This is looking good for RHEL 5, as the betas will be later, -18-based versions.

Please let me clarify this issue against RHEL4: the patch in comment #24 is the right fix for the kernel. Also, we need to backport something from nfs-utils-1.0.9, right?

No... The patch in comment #24 is a kernel patch, not an nfs-utils patch... We don't need to backport anything from nfs-utils-1.0.9. Although I tested with 1.0.9, earlier versions should also work, as nfs-utils is not the cause of the bug.

Committed in stream U5 build 42.11.
A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/

From Symantec:

Hi, I picked up "kernel-2.6.9-42.15.EL.i686.rpm" from http://people.redhat.com/~jbaron/rhel4/ but the issue still remains.

On the server:

[root@vcslinux35 2.6.9-42.15.EL-i686]# uname -a
Linux vcslinux35.vxindia.veritas.com 2.6.9-42.15.EL #1 Fri Sep 29 18:08:58 EDT 2006 i686 i686 i386 GNU/Linux
[root@vcslinux35 2.6.9-42.15.EL-i686]# ls -l /dev/vx/dsk/fendg/vol
brw------- 1 root root 199, 65534 Oct 4 16:04 /dev/vx/dsk/fendg/vol
[root@vcslinux35 2.6.9-42.15.EL-i686]# mount
/dev/sda1 on / type ext3 (rw)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
usbfs on /proc/bus/usb type usbfs (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
tmpfs on /dev/vx type tmpfs (rw,size=4k,nr_inodes=2097152,mode=0755)
/dev/vx/dsk/fendg/vol on /mnt1 type ext3 (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)

On the client:

[root@vcslinux119 ~]# showmount -e vcslinux35
Export list for vcslinux35:
/mnt1 *
[root@vcslinux119 ~]# mount vcslinux35:/mnt1 /mnt1
hangs forever...

Please let me know if you need any more info.

Any update on this one, please? We are still waiting on a fix. If you need more info, please let us know.

Rita, we are currently working on RHEL 5, and will start prioritizing 4.5 issues soon. Many partners have requested this fix, but we will need your help in testing when needed.

Fixed and tested on RHEL4 U5 snapshot 5.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.
http://rhn.redhat.com/errata/RHBA-2007-0304.html

*** Bug 174385 has been marked as a duplicate of this bug. ***

A note for anyone else who stumbles on this bz: the corresponding nfs-utils patch was handled in bz 228868. The fix requires version nfs-utils-1.0.6-78 on the server.