Bug 1091102
| Summary: | The pacemaker nfsserver resource agent's execution of sm-notify fails during startup | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | David Vossel <dvossel> |
| Component: | resource-agents | Assignee: | David Vossel <dvossel> |
| Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 6.6 | CC: | agk, cluster-maint, djansa, fdinitto, jherrman, mnovacek, sbradley |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | resource-agents-3.9.5-8.el6 | Doc Type: | Bug Fix |
| Doc Text: | Previously, Pacemaker's nfsserver resource agent was unable to properly perform NFSv3 network status monitor (NSM) state notifications. As a consequence, NFSv3 clients could not reclaim file locks after server relocation or recovery. This update introduces the nfsnotify resource agent, which sends NSM notifications correctly and thus allows NFSv3 clients to reclaim their file locks (see the sm-notify sketch after this table). | | |
| Story Points: | --- | | |
| Clone Of: | 1091101 | Environment: | |
| Last Closed: | 2014-10-14 05:00:55 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1091101 | | |
| Bug Blocks: | | | |
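As background to the Doc Text above: NSM reboot notifications are what the sm-notify utility sends, and the nfsnotify agent's job is to run it with the right source identity after a relocation. The following is only a minimal sketch of such an invocation, assuming the floating address 10.34.70.136 and a statd state directory kept on shared storage under /mnt/shared/nfs (values taken from the verification below; the agent's exact arguments are not shown in this report):

```
# Hedged sketch: notify every client recorded in the statd state directory
# that the NFS service identity has "rebooted", so they reclaim their locks.
#   -f  force notification even if sm-notify believes it has already run
#   -v  advertise the floating service address as the notification source,
#       so clients match it to the server they actually mounted
#   -P  use a statd state directory on shared storage instead of the
#       default /var/lib/nfs/statd (path here is an assumption)
sm-notify -f -v 10.34.70.136 -P /mnt/shared/nfs/statd
```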
Description
David Vossel
2014-04-24 22:04:45 UTC
There's an upstream pull request related to this issue:
https://github.com/ClusterLabs/resource-agents/pull/420

*** Bug 1091474 has been marked as a duplicate of this bug. ***

I have verified (using instructions from comment #6) that sm-notify works correctly after NFS server failover with the new nfsnotify resource agent from resource-agents-3.9.5-11.el6.x86_64.

----

The client's NFSv3 mount, with a lock held across the failover:

```
nfs-client# mount | grep shared
10.34.70.136:/mnt/shared/1 on /exports/1 type nfs (rw,vers=3,addr=10.34.70.136)
nfs-client# flock /exports/1/urandom -c 'sleep 10000'
...
```

NLM traffic shows the lock call before failover and the reclaim after it:

```
# tshark -i eth0 -R nlm
Running as user "root" and group "root". This could be dangerous.
Capturing on eth0
10.523062 10.34.71.133 -> 10.34.70.136 NLM 330 V4 LOCK Call FH:0x6c895d9c svid:137 pos:0-0
10.523350 10.34.70.136 -> 10.34.71.133 NLM 106 V4 LOCK Reply (Call In 52)
<failover occurs>
29.301472 10.34.71.133 -> 10.34.70.136 NLM 330 V4 LOCK Call FH:0x6c895d9c svid:137 pos:0-0
32.303873 10.34.71.133 -> 10.34.70.136 NLM 330 V4 LOCK Call FH:0x6c895d9c svid:137 pos:0-0
32.332312 10.34.70.136 -> 10.34.71.133 NLM 106 V4 LOCK Reply (Call In 120)
```

The NSM (statd) notifications themselves are visible as well:

```
# tshark -i eth0 -R stat
Running as user "root" and group "root". This could be dangerous.
Capturing on eth0
<failover occurs>
27.793019 10.34.70.136 -> 10.34.71.133 STAT 142 V1 NOTIFY Call
27.793204 10.34.71.133 -> 10.34.70.136 STAT 66 V1 NOTIFY Reply (Call In 75)
27.793440 10.34.70.136 -> 10.34.71.133 STAT 142 V1 NOTIFY Call
27.793672 10.34.71.133 -> 10.34.70.136 STAT 66 V1 NOTIFY Reply (Call In 77)
```

Obtaining another lock fails, so the lock is still being held by the original process:

```
nfs-client# flock --nonblock /exports/1/urandom -c 'sleep 10'
nfs-client# echo $?
1
```
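For reference, a resource group like the one used for this verification can be assembled with pcs. This is only a sketch of the key members (the exportfs resources and the per-operation timeouts are omitted); the attribute values simply mirror the configuration shown below:

```
# Shared filesystem holding the NFS state, the NFS server itself, the
# floating IP, and the notification agent. Order within the group is
# also the start order, so nfsnotify starts last.
pcs resource create mnt-shared ocf:heartbeat:Filesystem \
    device=/dev/shared/shared0 directory=/mnt/shared fstype=ext4 \
    force_unmount=safe --group hanfs
pcs resource create nfs-daemon ocf:heartbeat:nfsserver \
    nfs_ip=10.34.70.136 nfs_shared_infodir=/mnt/shared/nfs \
    nfs_no_notify=True --group hanfs
pcs resource create vip ocf:heartbeat:IPaddr2 \
    ip=10.34.70.136 cidr_netmask=23 --group hanfs
pcs resource create nfs-notify ocf:heartbeat:nfsnotify \
    source_host=pool-10-34-70-136.cluster-qe.lab.eng.brq.redhat.com \
    --group hanfs
```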
The cluster configuration is as follows:

```
virt-136# pcs status
Cluster name: STSRHTS24129
Last updated: Fri Jul 25 19:03:58 2014
Last change: Fri Jul 25 18:55:14 2014
Stack: cman
Current DC: virt-137 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured
10 Resources configured

Online: [ virt-136 virt-137 ]

Full list of resources:

 fence-virt-136 (stonith:fence_xvm): Started virt-136
 fence-virt-137 (stonith:fence_xvm): Started virt-137
 fence-virt-138 (stonith:fence_xvm): Started virt-136
 Resource Group: hanfs
     mnt-shared  (ocf::heartbeat:Filesystem): Started virt-136
     nfs-daemon  (ocf::heartbeat:nfsserver): Started virt-136
     export-root (ocf::heartbeat:exportfs): Started virt-136
     export0     (ocf::heartbeat:exportfs): Started virt-136
     export1     (ocf::heartbeat:exportfs): Started virt-136
     vip         (ocf::heartbeat:IPaddr2): Started virt-136
     nfs-notify  (ocf::heartbeat:nfsnotify): Started virt-136
```

```
virt-136# pcs resource show hanfs
Group: hanfs
 Resource: mnt-shared (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/shared/shared0 directory=/mnt/shared fstype=ext4 options= force_unmount=safe
  Operations: start interval=0s timeout=60 (mnt-shared-start-timeout-60)
              stop interval=0s timeout=60 (mnt-shared-stop-timeout-60)
              monitor interval=30s (mnt-shared-monitor-interval-30s)
 Resource: nfs-daemon (class=ocf provider=heartbeat type=nfsserver)
  Attributes: nfs_ip=10.34.70.136 nfs_shared_infodir=/mnt/shared/nfs nfs_no_notify=True
  Operations: start interval=0s timeout=40 (nfs-daemon-start-timeout-40)
              stop interval=0s timeout=20s (nfs-daemon-stop-timeout-20s)
              monitor interval=30s (nfs-daemon-monitor-interval-30s)
 Resource: export-root (class=ocf provider=heartbeat type=exportfs)
  Attributes: directory=/mnt/shared clientspec=* options=rw,sync fsid=304
  Operations: start interval=0s timeout=40 (export-root-start-timeout-40)
              stop interval=0s timeout=120 (export-root-stop-timeout-120)
              monitor interval=10 timeout=20 (export-root-monitor-interval-10)
 Resource: export0 (class=ocf provider=heartbeat type=exportfs)
  Attributes: directory=/mnt/shared/0 clientspec=* options=rw,sync fsid=1
  Operations: start interval=0s timeout=40 (export0-start-timeout-40)
              stop interval=0s timeout=120 (export0-stop-timeout-120)
              monitor interval=10 timeout=20 (export0-monitor-interval-10)
 Resource: export1 (class=ocf provider=heartbeat type=exportfs)
  Attributes: directory=/mnt/shared/1 clientspec=* options=rw,sync fsid=2
  Operations: start interval=0s timeout=40 (export1-start-timeout-40)
              stop interval=0s timeout=120 (export1-stop-timeout-120)
              monitor interval=10 timeout=20 (export1-monitor-interval-10)
 Resource: vip (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.34.70.136 cidr_netmask=23
  Operations: start interval=0s timeout=20s (vip-start-timeout-20s)
              stop interval=0s timeout=20s (vip-stop-timeout-20s)
              monitor interval=30s (vip-monitor-interval-30s)
 Resource: nfs-notify (class=ocf provider=heartbeat type=nfsnotify)
  Attributes: source_host=pool-10-34-70-136.cluster-qe.lab.eng.brq.redhat.com
  Operations: start interval=0s timeout=90 (nfs-notify-start-timeout-90)
              stop interval=0s timeout=90 (nfs-notify-stop-timeout-90)
              monitor interval=30 timeout=90 (nfs-notify-monitor-interval-30)
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1428.html