Red Hat Bugzilla – Bug 131235
clusvcadm -d <svcname> hangs forever when lsof for NFS filesystem hangs
Last modified: 2009-04-16 16:15:12 EDT
Description of problem:
1. RHAS 2.1 based system with clumanager 1.2.16 compiled for this kernel.
2. Two-member cluster with an IP-based tiebreaker.
3. Two services configured, each exporting one or more devices with one
or more IP addresses.
4. Default service configuration runs svc1 on host1 and svc2 on host2.
5. For load balancing, access to the exported disks is needed on both
hosts: host1 cross-mounts the partitions exported by svc2 using an NFS
hard mount, and host2 cross-mounts the partitions exported by svc1,
again using an NFS hard mount.
6. If either host fails, both svc1 and svc2 run on the remaining host
and all partitions are NFS-mounted from that same host.
Steps to Reproduce:
1. Service disable for svc2 on host2 works fine.
2. Now the IP address for svc2 is unavailable, and disabling svc1 runs
lsof, which hangs forever in the
/proc/<pid_accessing_svc_partition>/cwd directory (because the NFS
server for the hard mount is not responding). Hence disabling svc1
does not work.
3. However, partitions belonging to svc1 are in
4. Even soft-mounting the partitions did not help.
Actual Results: Generically, for any cluster host with at least one NFS
mount (not necessarily related to the cluster configuration in any way)
in the "server not responding" state, service stop hangs forever.
Expected Results: service stop should time out or abort.
(1) Can 'lsof -b' be tried to skip NFS-mounted partitions? This works
when just the NFS service is unavailable but the NFS IP is reachable;
it does not work when the NFS IP itself is taken down.
(2) Can any kernel change be made to kill processes hanging on NFS
partitions without using lsof/fuser?
Sample fstab entry for a cross-mounted partition:
svc2:/cluster/disk2  /mnt/disk2  nfs  rw,bg,intr,timeo=1,retrans=3  0 0
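For reference, the fstab entry above corresponds roughly to the following mount command (export path and mount point taken from the entry; this is a sketch, not necessarily the reporter's exact invocation):

```shell
# Hard NFS mount matching the fstab entry above. 'hard' is the default;
# 'intr' lets signals interrupt some NFS waits, while timeo=1 (0.1 s)
# and retrans=3 only tune retry timing -- they do not make a hard mount
# behave like a soft one when the server stops responding.
mount -t nfs -o rw,bg,intr,timeo=1,retrans=3 svc2:/cluster/disk2 /mnt/disk2
```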
Response to question (1): Yes, but you may have to add more grepping
for the cluster-managed device.
Note that the lsof code is from a time when fuser wasn't on many
distributions. The RPM spec for clumanager apparently doesn't have an
install dependency on psmisc; it probably should (it uses fuser and
lsof).
(Why are you trying to use lsof instead of fuser?)
Response to question (2): There's always a way, but I doubt there's a
clean way. If a process is actually trying to touch an inaccessible
mount, you can't normally kill it: it goes into disk-wait state while
waiting for the I/O to complete, where it can't be interrupted.
You can pass the -f flag to the umount command line; this should work
for NFS volumes, but not for other data sources (e.g. block devices).
lsof is used by the svclib_filesystem script, which runs lsof followed
by fuser. In this case, however, both hang accessing the
/proc/<pid>/cwd entries of processes on the unresponsive mount, so
using only fuser will not solve the problem either. Also, since the
hanging process is not necessarily related to the service being
stopped, it could be any process on that machine. Perhaps service stop
should fork off a child for fuser/lsof and kill it if it hangs for more
than a certain interval. (We also tested 'lsof -S' so that lsof would
time out, but even that hangs.)
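The fork-and-kill idea above can be sketched as a small watchdog wrapper. This is illustrative only; run_with_timeout is a hypothetical helper, not part of clumanager, and coreutils timeout(1) was not commonly available at the time. Note the caveat in the comments: if the child itself is stuck in uninterruptible disk-wait, even SIGKILL is deferred.

```shell
#!/bin/sh
# Run a command, killing it if it exceeds a deadline in seconds.
# Caveat: this bounds the common case, where the child is killable.
# If the child is stuck in uninterruptible disk-wait (D state), the
# SIGKILL is deferred by the kernel and the wait below can still block.
run_with_timeout() {
    secs=$1; shift
    "$@" &                                   # run the command in background
    cmd=$!
    ( sleep "$secs"; kill -9 "$cmd" ) 2>/dev/null &
    dog=$!                                   # watchdog subshell
    wait "$cmd"                              # propagate the command's status
    rc=$?
    kill "$dog" 2>/dev/null                  # command finished: stop watchdog
    return $rc
}

# e.g.: run_with_timeout 5 lsof -b /mnt/disk2
```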
You're right, it does do lsof first.
Ok, we'll do lsof -b. It looks like fuser doesn't have any similar
method of operation.
Created attachment 103302 [details]
Patch to use lsof -b and _not_ use fuser if lsof exists
Patch is only against svclib_filesystem.
1. 'lsof -b | grep $dev' doesn't find the device; it works when
grepping for the mounted directory name instead.
2. Even 'lsof -b' hangs in this case. Not sure there is another way to
time out on lsof or to forcefully umount.
If lsof -b hangs/blocks anyway in this case, then there's probably a
bug in lsof (the point of -b is to _not_ block...).
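The change discussed here (matching on the mount point rather than on the device name) can be illustrated with a small filter over lsof's output. pids_on_mount is a hypothetical helper, not the actual svclib_filesystem code; it assumes lsof's default column layout, where the PID is column 2 and the file name is the last column.

```shell
#!/bin/sh
# Print the unique PIDs from `lsof -b` output whose open file (last
# column) lives under the given mount point. Matching on the path works
# where matching on the raw device name ($dev) does not. This is a
# simple prefix match; a stricter check would also require the next
# character to be '/' or end-of-string, so /mnt/disk22 would not match.
pids_on_mount() {
    awk -v mnt="$1" 'NR > 1 && index($NF, mnt) == 1 { print $2 }' | sort -u
}

# e.g.: lsof -b 2>/dev/null | pids_on_mount /mnt/disk2
```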
Created attachment 103342 [details]
Only use lsof when lsof exists (don't use fuser). Use -b + mount point instead of device
This change worked for me.
How I tested:
(1) Mount an NFS mount on the clustered server.
(2) Do a 'find' on the NFS mount point. While 'find' is running, kill
the NFS server.
(3) Start up a shell in the clustered service's mount point.
(4) Disable service.
The shell was properly killed, and nothing was hanging except the
Sep 1 10:47:25 magenta clusvcmgrd: : <notice> service notice:
Stopping service IP_disk_check_test ...
Sep 1 10:47:26 magenta clusvcmgrd: : <warning> service
warning: killing process 1952 (root bash /dev/sdd3)
Sep 1 10:47:31 magenta clusvcmgrd: : <notice> service notice:
Stopped service IP_disk_check_test ...
I also retried with no processes accessing the hung NFS mount point,
which also worked.
Does step 2) "kill the NFS server" mean everything that 'clusvcadm -d
<svc>' does on the NFS server, i.e. nfs stop & start, ifconfig the
interface down, etc.?
The 'lsof -b' itself hangs for us in the case mentioned in the bug.
Is there a version of lsof that doesn't hang with the -b option?
We are using lsof-4.51-2.
Unexport/remove the interface the client was accessing.
The version that shipped with RHEL3 doesn't block; you should be able
to get the RPMs from RHN. That's probably the problem you're seeing.
1.2.18pre1 patch (unsupported; test only, etc.)
This includes the fix for this bug and a few others.
RHEL3 uses lsof-4.63-4. I tried this one, and its lsof -b hangs too!
In your test plan, can you try running step 3) before step 2):
(3) Start up a shell script in the clustered service's mount point.
(2) ..... kill the NFS server.
This may cause lsof to hang.
When an NFS service goes down (ifconfig the interface down, nfs stop,
exportfs -u, etc.) while some script was already accessing the
NFS-mounted partition exported by that service, lsof (-b, ...) hangs
for us.
We are not using RHEL3; we are using RHAS 2.1. Could be a kernel
difference.
Using the new patch anyway. It doesn't fix this bug for us.
You're correct; it actually just took me a few more tries to reproduce
it. The bug may be in lsof. Here's what I did (outside of the cluster
software):
(1) Mount an NFS export from another machine (hard mount, not soft)
(2) cd /new_nfs_mount; while [ 0 ]; do find . ; done
(3) Disable the NFS export + ifdown the interface, and/or reboot the
NFS server (so that the client goes into retry mode)
(4) lsof -b
Step (4) hangs.
I tried the above steps with the following combinations, all hung
after a few tries; it seems newer versions of lsof took more tries:
RHEL 2.1 + lsof 4.52
RHEL 2.1 + lsof 4.63
RHEL 3 + lsof 4.63
RHEL 3 + lsof 4.72
Fedora Core 2 + lsof 4.72
They hang while doing stat64, and go into disk-wait:
read(4, "30030 (bash) S 30025 30030 30030"..., 4096) = 224
close(4) = 0
munmap(0xb7298000, 4096) = 0
readlink("/proc/30030/cwd", "/mnt/tmp", 4096) = 8
I doubt fuser will do any better in this case.
In the worst case, the patch provided here fixes the fact that it was
using _both_ tools, instead of one or the other, for killing processes
on the mount.
Sorry, 4.51, not 4.52 in above comment.
Bug #131712 opened against lsof.
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.