Bug 131712

Summary:	"lsof -b" hangs if a process is stuck in disk-wait
Product:	[Fedora] Fedora	Reporter:	Lon Hohberger <lhh>
Component:	lsof	Assignee:	Karel Zak <kzak>
Status:	CLOSED RAWHIDE	QA Contact:	Brock Organ <borgan>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	rawhide	CC:	astrand, camarathe
Target Milestone:	---
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2005-03-23 11:53:57 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Lon Hohberger 2004-09-03 14:20:46 UTC

Description of problem:
'lsof -b', hangs in disk-wait if another process is also stuck in
disk-wait.

Version-Release number of selected component (if applicable): 4.72-1

How reproducible: ~10-30% on first try.  Often takes several
iterations of step (4) below.

Steps to Reproduce:
1. Mount NFS export from another machine (hard mount, not soft)
2. cd /new_nfs_mount; while [ 0 ]; do find . ; done
3. Disable NFS export+ifdown interface and/or reboot NFS server 
4. Run 'lsof -b' on client

  
Actual results:
Step (4) hangs while doing stat64; lsof process goes into disk-wait:

read(4, "30030 (bash) S 30025 30030 30030"..., 4096) = 224
close(4)                                = 0
munmap(0xb7298000, 4096)                = 0
readlink("/proc/30030/cwd", "/mnt/tmp", 4096) = 8
stat64("/proc/30030/cwd",


Expected results:
No hang.


Additional info:
I tried the above steps with the following combinations, all hung
after a few tries.

RHEL 2.1 + lsof 4.52
RHEL 2.1 + lsof 4.63
RHEL 3 + lsof 4.63
RHEL 3 + lsof 4.72
Fedora Core 2 + lsof 4.72

Could it possibly a kernel bug?  The original process goes into disk
wait, while 'lsof -b' subsequently succeeds a few times (usually)
before hanging.

Comment 1 Karel Zak 2005-01-17 13:30:15 UTC

lsof Readme:

NFS Blocks
==========

Lsof is susceptible to NFS blocks when it tries to lstat() mounted
file systems and when it does further processing -- lstat() and
readlink() -- on its optional file and file system arguments.

Lsof tries to avoid being stopped completely by NFS blocks by doing
the lstat() and readlink() functions in a child process, which
returns the function response via a pipe.  The lsof parent limits
the wait for data to arrive in the pipe with a SIGALRM, and, if
the alarm trips, terminates the child process with a SIGINT and a
SIGKILL.

This is as reliable and portable a method for breaking NFS deadlocks
as I have found, although it still fails under some combinations
of NFS version, UNIX dialect, and NFS file system mount options.
It generally succeeds when the "intr" or "soft" mount options are
used; it generally fails when the "hard" mount option is used.

When lsof cannot kill the child process, a second timeout causes
it to stop waiting for the killed child to complete.  While the
second timeout allows lsof to complete, it may leave behind a hung
child process.  Unless warnings are inhibited by default or with
the -w option, lsof reports the possible hung child.

NFS block handling was updated with suggestions made by Andreas
Stolcke.  Andreas suggested using the alternate device numbers that
appear in the mount tables of some dialects when it is not possible
to stat(2) the mount points.

The -b option was added to direct lsof to avoid the stat(2) and
readlink(2) calls that might block on NFS mount points and always
use the alternate device numbers.  If warning message issuance is
enabled and you don't want warning messages about what lsof is
doing, use the -w option, too.

The -O option directs lsof to avoid doing the potentially blocking
operations in child processes.  Instead, when -O is specified, lsof
does them directly.  While this consumes far less system overhead,
it can cause lsof to hang, so I advise you to use -O sparingly.

Comment 2 Karel Zak 2005-01-17 14:34:26 UTC

Maybe linux port of lsof bad detects NFS in the /proc/mounts. Can you
send me your "cat /proc/mounts" from your system?