Bug 1024065
Summary: | netfs unmount/self_fence integration | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | David Vossel <dvossel> |
Component: | resource-agents | Assignee: | David Vossel <dvossel> |
Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 6.6 | CC: | agk, cluster-maint, djansa, fdinitto, jcastillo, jharriga, jpokorny, jruemker, jsvarova, luvilla, michele, mnovacek |
Target Milestone: | rc | Keywords: | Reopened, ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | 0day | ||
Fixed In Version: | resource-agents-3.9.2-41.el6 | Doc Type: | Bug Fix |
Doc Text: |
Prior to this update, the netfs agent could hang during a stop operation, even with the self_fence option enabled. With this update, the self-fence operation is executed earlier in the stop process, which ensures that the NFS client detects that the server has gone away and, if umount cannot succeed, that self-fencing occurs.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2014-10-14 04:59:38 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1010423, 1027410, 1055424, 1117032 |
Description
David Vossel
2013-10-28 18:00:35 UTC
Upstream patch related to this issue:
https://github.com/davidvossel/resource-agents/commit/617e52862264e07dce5c0a1b2c693a9073458341

I have verified that with resource-agents-3.9.5-11.el6.x86_64 the node self-fences when trying to umount an unreachable NFS mount.

# export \
    OCF_FUNCTIONS_DIR=/usr/lib/ocf/lib/heartbeat \
    OCF_RESKEY_name=nfsmount \
    OCF_RESKEY_host=10.34.70.155 \
    OCF_RESKEY_mountpoint=/mnt \
    OCF_RESKEY_export=/mnt/shared0 \
    OCF_RESKEY_fstype=nfs OCF_RESKEY_self_fence=yes
# /usr/share/cluster/netfs.sh start
# mount | grep shared
10.34.70.155:/mnt/shared0 on /mnt type nfs (rw,sync,soft,noac,vers=4,addr=10.34.70.155,clientaddr=10.34.71.133)
# ssh 10.34.70.155 "iptables -I INPUT 1 -s $(hostname -f) -j DROP"
# date; /usr/share/cluster/netfs.sh stop
Wed Jul 23 15:18:25 CEST 2014
<info> pre unmount: checking if nfs server 10.34.70.155 is alive
[netfs.sh] pre unmount: checking if nfs server 10.34.70.155 is alive
<debug> Testing generic rpc access on server 10.34.70.155 with protocol tcp
[netfs.sh] Testing generic rpc access on server 10.34.70.155 with protocol tcp
<alert> RPC server on 10.34.70.155 with tcp is not responding
[netfs.sh] RPC server on 10.34.70.155 with tcp is not responding
<alert> NFS server not responding - REBOOTING
[netfs.sh] NFS server not responding - REBOOTING

</var/log/messages shows the following>
...
Jul 23 15:17:55 virt-133 kernel: FS-Cache: Loaded
Jul 23 15:17:55 virt-133 kernel: NFS: Registering the id_resolver key type
Jul 23 15:17:55 virt-133 kernel: FS-Cache: Netfs 'nfs' registered for caching
Jul 23 15:21:52 virt-133 kernel: nfs: server 10.34.70.155 not responding, timed out
Jul 23 15:21:52 virt-133 rgmanager[2291]: [netfs.sh] pre unmount: checking if nfs server 10.34.70.155 is alive
Jul 23 15:22:55 virt-133 rgmanager[2349]: [netfs.sh] RPC server on 10.34.70.155 with tcp is not responding
Jul 23 15:22:55 virt-133 rgmanager[2353]: [netfs.sh] NFS server not responding - REBOOTING

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1428.html
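
For readers who want to see the ordering that the Doc Text and the verification log describe (probe the server first, self-fence before umount can hang), here is a minimal shell sketch. It is not the code from netfs.sh or from the upstream patch. The names stop_netfs, probe_rpc, PROBE_CMD, FENCE_CMD and UMOUNT_CMD are hypothetical, the use of `rpcinfo -t HOST nfs` as the probe is an assumption, and the fence action defaults to a harmless echo where a real agent would reboot the node.

```shell
#!/bin/sh
# Illustrative sketch only: names below are hypothetical, not from netfs.sh.

# Commands are overridable so the sketch can be dry-run safely.
PROBE_CMD=${PROBE_CMD:-probe_rpc}          # how to test the NFS server
FENCE_CMD=${FENCE_CMD:-echo WOULD-REBOOT}  # a real agent would reboot here
UMOUNT_CMD=${UMOUNT_CMD:-umount}           # how to unmount

# probe_rpc HOST: succeed if the nfs RPC program on HOST answers over TCP.
# Assumption: rpcinfo is installed and "nfs" resolves via /etc/rpc.
probe_rpc() {
    rpcinfo -t "$1" nfs >/dev/null 2>&1
}

# stop_netfs HOST MOUNTPOINT SELF_FENCE
# Checks the server before umount so a dead server triggers the fence
# action instead of leaving umount to block.
stop_netfs() {
    host=$1
    mountpoint=$2
    self_fence=$3

    echo "pre unmount: checking if nfs server $host is alive"
    if ! $PROBE_CMD "$host"; then
        if [ "$self_fence" = "yes" ]; then
            echo "NFS server not responding - REBOOTING"
            $FENCE_CMD
            return 1
        fi
        # Without self_fence this sketch still tries umount, which may block.
        echo "NFS server not responding, self_fence not enabled"
    fi

    $UMOUNT_CMD "$mountpoint"
}
```

With the default FENCE_CMD, calling `stop_netfs 10.34.70.155 /mnt yes` against an unreachable server only prints WOULD-REBOOT, so the control flow can be tried without rebooting anything.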