Bug 151897

Summary: RHEL4: "lsof -b" hangs if a process is stuck in disk-wait
Product: Red Hat Enterprise Linux 4 Reporter: Karel Zak <kzak>
Component: lsof Assignee: Karel Zak <kzak>
Status: CLOSED ERRATA QA Contact: Brock Organ <borgan>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.0 CC: dff, laroche
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2006-0229 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-03-22 16:45:10 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 187538    

Description Karel Zak 2005-03-23 11:36:21 UTC
+++ This bug was initially created as a clone of Bug #131712 +++

Description of problem:
'lsof -b' hangs in disk-wait if another process is also stuck in
disk-wait.

Version-Release number of selected component (if applicable): 4.72-1

How reproducible: ~10-30% on first try.  Often takes several
iterations of step (4) below.

Steps to Reproduce:
1. Mount NFS export from another machine (hard mount, not soft)
2. cd /new_nfs_mount; while [ 0 ]; do find . ; done
3. Disable NFS export+ifdown interface and/or reboot NFS server 
4. Run 'lsof -b' on client

  
Actual results:
Step (4) hangs while doing stat64; lsof process goes into disk-wait:

read(4, "30030 (bash) S 30025 30030 30030"..., 4096) = 224
close(4)                                = 0
munmap(0xb7298000, 4096)                = 0
readlink("/proc/30030/cwd", "/mnt/tmp", 4096) = 8
stat64("/proc/30030/cwd",
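The trace above stops inside stat64() on a path that resolves to the dead
NFS mount. As an illustration only (this is not lsof's code and not the
fix shipped in the erratum), one common way for a tool to protect itself
is to run the blocking call in a child process and give up after a
timeout. The function name stat_with_timeout and the 5 second default are
assumptions made for this sketch.

```python
import subprocess
import sys


def stat_with_timeout(path, timeout=5.0):
    """Stat `path` in a child process so the caller itself never blocks.

    Returns True if stat succeeded, False if it failed, and None if the
    child did not finish within `timeout` seconds.
    """
    # The child is a tiny Python program that only performs the stat call.
    cmd = [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path]
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        return proc.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # The signal stays pending while the child is in disk-wait, so we
        # deliberately do not wait for the child again here.
        proc.kill()
        return None


if __name__ == "__main__":
    # Example path taken from the trace above; any path works.
    print(stat_with_timeout("/proc/self/cwd"))
```

The parent stays responsive because only the child can end up in
disk-wait; the cost is one extra process per guarded call.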


Expected results:
No hang.


Additional info:
I tried the above steps with the following combinations; all of them
hung after a few tries.

RHEL 2.1 + lsof 4.52
RHEL 2.1 + lsof 4.63
RHEL 3 + lsof 4.63
RHEL 3 + lsof 4.72
Fedora Core 2 + lsof 4.72

Could it possibly be a kernel bug?  The original process goes into disk
wait, while 'lsof -b' subsequently succeeds a few times (usually)
before hanging.
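
To confirm which processes are in disk-wait while reproducing this, the
state field of /proc/<pid>/stat can be checked (the read() in the trace
above shows such a line, with state "S" for bash). A minimal sketch,
assuming the usual "pid (comm) state ..." layout of that file; the
function names are made up for this example.

```python
def parse_proc_stat_state(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the first "(" and the last ")".
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state


def is_disk_wait(stat_line):
    """True if the process is in uninterruptible sleep (state "D")."""
    return parse_proc_stat_state(stat_line)[2] == "D"


if __name__ == "__main__":
    # Line prefix copied from the strace output in this report.
    print(parse_proc_stat_state("30030 (bash) S 30025 30030 30030"))
```

Reading /proc/<pid>/stat does not touch the hung mount, so this check
should not block the way the stat64() on the cwd link does.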

Comment 10 Florian La Roche 2005-10-04 14:03:36 UTC
removing old acks

Comment 18 Red Hat Bugzilla 2006-03-22 16:45:10 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0229.html