Bug 451057
| Summary: | [RHEL 5] lockd not aware of statd -n switch | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Janne Karhunen <jkarhune> |
| Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED ERRATA | QA Contact: | |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 5.2 | CC: | cward, janne.karhunen, jlayton, kari.hautio, kvolny, pcfe, peterm, rdoty, staubach |
| Target Milestone: | rc | Keywords: | OtherQA |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2009-01-20 21:01:20 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 409021 | | |
| Attachments: | | | |

Description
Janne Karhunen
2008-06-12 16:30:00 UTC
Created attachment 309099
lockd callback fix for rhel5
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Patch tested on rhel5. Works fine.

What needs to be done to get this patch posted? We are getting close to the deadline.

Fixed in nfs-utils-1.0.9-36.el5

Re comment #4: please provide details of the package set that was tested, or clarify the reproducer setup.

Janne, as Comment #4 is too vague and we now have RHEL 5.3 Beta available on london, please test again and report back with the versions you tested.

[the below will make perfect sense to Janne but no sense at all to people with no access to the lab in Helsinki]
If you need machines for the testing, I have reserved stitch 3/8 slots 9 through 14 for you. That is 3 CPU blades, each with its own HD, each pair on a separate loop. You can access them as fargo 236, 237 and 238. Please do tell if you reconfigure the fibre loop for this test.
[Helsinki part ends]

It is imperative that you test and report back.

PCFE

1. First, set up an NFS server failover pair (with a migrating server IP address, NOTIFY LIST and UNIFORM SERVER_NAME on both nodes) and have two clients connected to the server.
2. You also need a basic application that grabs a lock over NFS.
3. Run the locktest application on the client once server1 is up.
4. Watch with ethereal that the standard notify links are established both ways (tshark -R stat).
5. Fail over server1 to server2.
6. Watch that the server notifies the client that the server 'rebooted', and verify that the client retakes the lock.
7. Double check with client2 that client1 is indeed still holding that lock and that no duplicate locks are granted.

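The "basic application that grabs a lock over NFS" is not attached to this report. A minimal sketch of what such a test program could look like is given here, assuming a Python interpreter on the client and an NFS mount where POSIX locks are forwarded to the server rather than handled locally. The function names `grab_lock` and `hold_lock` and the path in the usage comment are hypothetical, not the reporter's actual locktest tool.

```python
import fcntl
import os
import sys
import time


def grab_lock(path):
    """Take an exclusive, non-blocking POSIX lock on `path`.

    Returns the open file descriptor on success, or None if the lock
    could not be taken (for example because another client holds it).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def hold_lock(path, seconds):
    """Grab the lock, hold it for `seconds`, then release it by closing the file."""
    fd = grab_lock(path)
    if fd is None:
        print("lock not available:", path)
        return False
    print("lock acquired:", path)
    try:
        time.sleep(seconds)
    finally:
        os.close(fd)
    return True


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python locktest.py /mnt/nfs/lockfile 600
    hold_lock(sys.argv[1], float(sys.argv[2]))
```

Running it on client1 with a long hold time and then on client2 against the same file is one way to check the last step: client2 should report that the lock is not available for as long as client1 still holds it, including after the failover.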
If the patch is not applied, you will see that the server's statd notification towards lockd is dropped, because the request comes in via an unknown interface *IF* statd had 'correctly' bound the corresponding interface (the one SERVER_NAME refers to). So it may also work by accident if the migrated interface is not up at the time statd starts (or, more precisely, sm-notify on rhel5), as IP_FREEBIND is enabled by default. HTH;

Janne, as discussed earlier, in case you manage to squeeze in testing for this next week, all of stitch8 is now reserved for you. The reservation of the blades in stitch8 has been revoked. You can change the fibre loop config in stitch8 to suit your needs; no one else is on that chassis. http://london.fp.nsn-rdnet.net/LondontestNetwork/stitch/stitch_slot_assignment.html shows you the console ports.

As we're not using RHEL5 yet, do you have RHEL Cluster Suite with NFS failover available?

As discussed with Janne before he left: yes, CS is available on those nodes through yum. But I recommend you do your test just like in June, instead of learning an all-new product and getting potential side effects you do not expect. If after the weekend you have reconsidered and want to use CS, just have a peek at http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.2/html/Cluster_Administration/index.html and install the packages with yum. The easiest place to see the contents of the channel is on london under /var/ftp/pub/RHEL5.3-Beta/Server/i386/DVD/Cluster/

As far as comment #4 goes, I might have only tested that it doesn't bind anything 'illegal' on RHEL5 (can't remember). Proper testing with the scenario in comment #13 has probably not been done at all and will probably show up a bunch of new bugs :-/

[root@stitch-8-1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.10.10.125 stitch-8-1.testing.local stitch-8-1
192.168.1.1 stitch-nfs.testing.local stitch-nfs
[root@stitch-8-1 ~]# ip a |grep inet
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
inet 192.10.10.125/24 brd 192.10.10.255 scope global eth0
inet 192.168.1.1/24 brd 192.168.1.255 scope global eth0:1
inet6 fe80::20e:cff:fe52:d71e/64 scope link
[root@stitch-8-1 ~]# rpc.statd -n stitch-nfs
[root@stitch-8-1 ~]#
[root@stitch-8-1 ~]# netstat -anp|grep statd
tcp 0 0 0.0.0.0:880 0.0.0.0:* LISTEN 2394/rpc.statd
udp 0 0 0.0.0.0:874 0.0.0.0:* 2394/rpc.statd
udp 0 0 0.0.0.0:877 0.0.0.0:* 2394/rpc.statd
unix 2 [ ] DGRAM 11120 2394/rpc.statd
[root@stitch-8-1 ~]# ps -ef |grep statd
root 2394 1 0 15:39 ? 00:00:00 rpc.statd -n stitch-nfs
[root@stitch-8-1 ~]# rpm -qa |grep nfs
nfs-utils-lib-1.0.8-7.2.z2
nfs-utils-1.0.9-38.el5
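
The `netstat -anp` listing above is read by eye. As a rough sketch of doing the same check mechanically, the helper below extracts the local addresses a given program is bound to and labels each as wildcard or specific. It assumes the column layout shown in the session (protocol, two queue columns, local address, foreign address, optional state, then `PID/program`); the function name `local_addresses` is hypothetical.

```python
def local_addresses(netstat_output, program):
    """Return (protocol, local address) pairs for inet sockets owned by `program`."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only inet sockets; unix-domain lines use a different layout.
        if not fields or fields[0] not in ("tcp", "udp"):
            continue
        # The last column is "PID/Program name", e.g. "2394/rpc.statd".
        if fields[-1].partition("/")[2] != program:
            continue
        # Columns: proto, recv-q, send-q, local address, foreign address, ...
        found.append((fields[0], fields[3]))
    return found


# Lines copied from the session above.
SAMPLE = """\
tcp 0 0 0.0.0.0:880 0.0.0.0:* LISTEN 2394/rpc.statd
udp 0 0 0.0.0.0:874 0.0.0.0:* 2394/rpc.statd
udp 0 0 0.0.0.0:877 0.0.0.0:* 2394/rpc.statd
unix 2 [ ] DGRAM 11120 2394/rpc.statd
"""

for proto, addr in local_addresses(SAMPLE, "rpc.statd"):
    host = addr.rsplit(":", 1)[0]
    print(proto, addr, "wildcard" if host == "0.0.0.0" else "specific")
```

For the listing above, all three inet sockets of rpc.statd come out as wildcard binds.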

Partners, this bug should be fixed in the latest RHEL 5.3 Snapshot. We believe that you have some interest in its correct functionality, so we're making a friendly request to send us some testing feedback. If you have a chance to test it, please share your findings with us. If you have successfully VERIFIED the fix, please add PartnerVerified to the Bugzilla keywords, along with a description of the results. Thanks!

~~~ Attention Partners ~~~
The *last* RHEL 5.3 Snapshot 6 is now available at partners.redhat.com. A fix for this bug should be present. Please test and update this bug with test results as soon as possible. If the fix present in Snap6 meets all the expected requirements for this bug, please add the keyword PartnerVerified. If any new bugs are discovered, please CLONE this bug and describe the issues encountered there.

Janne Karhunen: please test and report back as asked in Comment #27.

The fix itself has been tested with the beta in Comment #18. As only minor changes, to functionality other than statd, have been made to nfs-utils since then, we consider this test still valid.

nfs-utils: nfs-utils-1.0.9-38.el5 -> nfs-utils-1.0.9-40.el5
-----------------------------------------------------------
Wed Nov 12 2008 Steve Dickson <steved> 1.0.9-40
- Fixed arguments to the hosts_ctl() call in the good_client() routine used in the tcpwrapper support. (bz 440120)
Tue Nov 11 2008 Steve Dickson <steved> 1.0.9-39
- Fixed typo in nfs initscript that caused rpc.rquotad daemons to be started but not stopped (bz 470483)

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0107.html