Bug 678497 - netfs.sh patch, when network is lost it takes too long to unmount the NFS filesystems
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: resource-agents
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assigned To: Chris Feist
QA Contact: Cluster QE
Depends On:
Blocks:
Reported: 2011-02-18 04:22 EST by Raul Mahiques
Modified: 2011-12-06 07:02 EST (History)

See Also:
Fixed In Version: resource-agents-3.9.2-6.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 678494
Environment:
Last Closed: 2011-12-06 07:02:39 EST


Attachments
Fix (1.92 KB, patch)
2011-08-03 19:27 EDT, Lon Hohberger
Fix, pass 2 (2.36 KB, patch)
2011-08-08 15:35 EDT, Lon Hohberger


External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2011:1580
Priority: normal
Status: SHIPPED_LIVE
Summary: Low: resource-agents security, bug fix, and enhancement update
Last Updated: 2011-12-05 19:38:57 EST

Description Raul Mahiques 2011-02-18 04:22:00 EST
+++ This bug was initially created as a clone of Bug #678494 +++

Description of problem:
With the current netfs.sh script, when the network connection to an NFS server is lost, unmounting the filesystem takes longer than necessary.
Running "umount -f" before "fuser" speeds up the process when no process is holding the mount point.


Version-Release number of selected component (if applicable):


How reproducible:
- Set up 2 or more NFS netfs resources in the cluster.
- Cut connectivity to the NFS share.


Steps to Reproduce:
1. Set up 2 or more NFS netfs resources in the cluster.
2. Cut connectivity to the NFS share.

  
Actual results:
It takes longer than it could to unmount the FS.

Expected results:
The NFS filesystem is unmounted more quickly when no process is holding it.

Additional info:
Comment 2 Lon Hohberger 2011-02-18 12:52:17 EST
This patch doesn't apply to RHEL6; we already do umount -f; all we need to do is move do_force_unmount() to -before- 'fuser -kvm' in fs-lib.sh
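
The reordering described in this comment can be sketched roughly as follows. This is a hypothetical illustration, not the actual fs-lib.sh code: the function name stop_mount and the UMOUNT and FUSER override variables are assumptions made up for this example, and the real do_force_unmount() logic is not reproduced here. The point is only the order of operations: plain umount, then "umount -f", and fuser only as a last resort.

```sh
#!/bin/sh
# Hypothetical sketch of the proposed ordering; NOT the real fs-lib.sh.
# UMOUNT and FUSER are assumed override hooks so the order can be
# dry-run with stubs instead of touching a real mount.
: "${UMOUNT:=umount}"
: "${FUSER:=fuser}"

stop_mount() {
    mp="$1"

    # 1. Plain unmount: returns right away if nothing holds the mount point.
    "$UMOUNT" "$mp" && return 0

    # 2. Forced unmount BEFORE fuser, so an unreachable NFS server does not
    #    leave fuser stuck trying to stat the mount point.
    echo "Calling 'umount -f $mp'"
    "$UMOUNT" -f "$mp" && return 0

    # 3. Last resort: signal processes using the mount point, then retry.
    echo "Sending SIGTERM to processes on $mp"
    "$FUSER" -TERM -kvm "$mp"
    "$UMOUNT" "$mp"
}
```

Calling stop_mount /mnt/tmp would then only reach the fuser step if both unmount attempts fail.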
Comment 6 Lon Hohberger 2011-08-03 19:26:19 EDT
Test setup:
 * mount on crackle from 192.168.122.201
 * make 192.168.122.201 unavailable (in this case, disabled the service)
 * start 'touch /mnt/tmp/b' in one terminal
 * 'time ./netfst stop' in second terminal

Source of netfst:

[root@crackle ~]# cat netfst
#!/bin/sh

export OCF_RESKEY_name="foo"
export OCF_RESKEY_host="192.168.122.201"
export OCF_RESKEY_export="/mnt/gfs2"
export OCF_RESKEY_mountpoint="/mnt/tmp"
export OCF_RESKEY_force_unmount="1"

/usr/share/cluster/netfs.sh $1

Pre-patch 'stop' of netfst:


[root@crackle ~]# mount | grep /mnt/tmp
192.168.122.201:/mnt/gfs2 on /mnt/tmp type nfs (rw,sync,soft,noac,addr=192.168.122.201)
[root@crackle ~]# time ./netfst stop
<info>   unmounting /mnt/tmp
[netfs.sh] unmounting /mnt/tmp
umount.nfs: /mnt/tmp: device is busy
umount.nfs: /mnt/tmp: device is busy
<debug>  umount failed: 16
[netfs.sh] umount failed: 16
<warning>Sending SIGTERM to processes on /mnt/tmp
[netfs.sh] Sending SIGTERM to processes on /mnt/tmp
Cannot stat /mnt/tmp: Input/output error
Cannot stat /mnt/tmp: Input/output error
Cannot stat /mnt/tmp: Input/output error
<info>   unmounting /mnt/tmp
[netfs.sh] unmounting /mnt/tmp

real    15m23.828s
user    0m0.162s
sys     0m0.495s
[root@crackle ~]# echo $?
0
[root@crackle ~]# mount | grep /mnt/tmp
[root@crackle ~]# 

Post-patch results (test build of resource-agents w/ patch):


[root@crackle ~]# mount | grep /mnt/tmp
192.168.122.201:/mnt/gfs2 on /mnt/tmp type nfs (rw,sync,soft,noac,addr=192.168.122.201)
[root@crackle ~]# rpm -Uvh resource-agents-3.9.2-3.el6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:resource-agents        ########################################### [100%]
[root@crackle ~]# time ./netfst stop
<info>   unmounting /mnt/tmp
[netfs.sh] unmounting /mnt/tmp
umount.nfs: /mnt/tmp: device is busy
umount.nfs: /mnt/tmp: device is busy
<debug>  umount failed: 16
[netfs.sh] umount failed: 16
<warning>Calling 'umount -f /mnt/tmp'
[netfs.sh] Calling 'umount -f /mnt/tmp'
<info>   192.168.122.201:/mnt/gfs2 is not mounted
[netfs.sh] 192.168.122.201:/mnt/gfs2 is not mounted

real    3m23.697s
user    0m0.154s
sys     0m0.305s
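
For comparing the two "real" timings above, a small helper along these lines may be handy. It is only a sketch: the to_seconds function name is made up for this example, and it assumes the "XmY.ZZZs" format printed by the shell's time builtin. With the values from this comment it reports a saving of roughly 12 minutes per stop.

```sh
#!/bin/sh
# Hypothetical helper: convert a `time`-style value such as "15m23.828s"
# into seconds. Assumes the "<minutes>m<seconds>s" format shown above.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

pre=$(to_seconds "15m23.828s")    # pre-patch "real" time from this comment
post=$(to_seconds "3m23.697s")    # post-patch "real" time from this comment

# Difference between the two runs, in seconds.
saved=$(LC_ALL=C awk -v a="$pre" -v b="$post" 'BEGIN { printf "%.3f\n", a - b }')

echo "pre-patch:  ${pre}s"
echo "post-patch: ${post}s"
echo "saved:      ${saved}s"
```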
Comment 7 Lon Hohberger 2011-08-03 19:27:31 EDT
Created attachment 516591
Fix
Comment 8 Lon Hohberger 2011-08-03 19:29:18 EDT
Comment on attachment 516591
Fix

Patch was bad; it caused regressions in other agents (fs.sh).
Comment 9 Lon Hohberger 2011-08-08 15:35:59 EDT
Created attachment 517291
Fix, pass 2

Improved patch which does not have the fs.sh regression.
Comment 10 Lon Hohberger 2011-08-08 15:44:29 EDT
Updated test result:

[root@crackle ~]# touch /mnt/tmp/b &
[1] 9369
[root@crackle ~]# time ./netfst stop
<info>   unmounting /mnt/tmp
[netfs.sh] unmounting /mnt/tmp
umount.nfs: /mnt/tmp: device is busy
umount.nfs: /mnt/tmp: device is busy
<debug>  umount failed: 16
[netfs.sh] umount failed: 16
<warning>Calling 'umount -f /mnt/tmp'
[netfs.sh] Calling 'umount -f /mnt/tmp'
touch: cannot touch `/mnt/tmp/b': Input/output error
<info>   192.168.122.201:/mnt/gfs2 is not mounted
[netfs.sh] 192.168.122.201:/mnt/gfs2 is not mounted
[1]+  Exit 1                  touch /mnt/tmp/b

real    3m23.518s
user    0m0.157s
sys     0m0.269s
[root@crackle ~]# echo $?
0

Running regression runs on fs.sh, but I think we're good.
Comment 11 Lon Hohberger 2011-08-08 15:52:55 EDT
netfs and fs regression runs passed on 3.9.1-3.el6
Comment 12 Lon Hohberger 2011-08-08 16:04:04 EDT
Patch pushed to upstream master:

https://github.com/ClusterLabs/resource-agents/commit/9af820f580691195378cb5bfd58a0a0cdb03802a

And posted to cluster-devel for inclusion in RHEL6 branch:

https://www.redhat.com/archives/cluster-devel/2011-August/msg00030.html
Comment 13 Lon Hohberger 2011-08-08 16:05:21 EDT
(In reply to comment #11)
> netfs and fs regression runs passed on 3.9.1-3.el6

Correction: that should read 3.9.2-3.el6, not 3.9.1-3.el6. This build was a test build.
Comment 17 errata-xmlrpc 2011-12-06 07:02:39 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1580.html
