Bug 151897 - RHEL4: "lsof -b" hangs if a process is stuck in disk-wait
Summary: RHEL4: "lsof -b" hangs if a process is stuck in disk-wait
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: lsof (Show other bugs)
(Show other bugs)
Version: 4.0
Hardware: All Linux
medium
medium
Target Milestone: ---
: ---
Assignee: Karel Zak
QA Contact: Brock Organ
URL:
Whiteboard:
Keywords:
Depends On:
Blocks: 187538
TreeView+ depends on / blocked
 
Reported: 2005-03-23 11:36 UTC by Karel Zak
Modified: 2007-11-30 22:07 UTC (History)
2 users (show)

Fixed In Version: RHBA-2006-0229
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-03-22 16:45:10 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2006:0229 normal SHIPPED_LIVE lsof bug fix update 2006-03-22 05:00:00 UTC

Description Karel Zak 2005-03-23 11:36:21 UTC
+++ This bug was initially created as a clone of Bug #131712 +++

Description of problem:
'lsof -b', hangs in disk-wait if another process is also stuck in
disk-wait.

Version-Release number of selected component (if applicable): 4.72-1

How reproducible: ~10-30% on first try.  Often takes several
iterations of step (4) below.

Steps to Reproduce:
1. Mount NFS export from another machine (hard mount, not soft)
2. cd /new_nfs_mount; while [ 0 ]; do find . ; done
3. Disable NFS export+ifdown interface and/or reboot NFS server 
4. Run 'lsof -b' on client

  
Actual results:
Step (4) hangs while doing stat64; lsof process goes into disk-wait:

read(4, "30030 (bash) S 30025 30030 30030"..., 4096) = 224
close(4)                                = 0
munmap(0xb7298000, 4096)                = 0
readlink("/proc/30030/cwd", "/mnt/tmp", 4096) = 8
stat64("/proc/30030/cwd",


Expected results:
No hang.


Additional info:
I tried the above steps with the following combinations, all hung
after a few tries.

RHEL 2.1 + lsof 4.52
RHEL 2.1 + lsof 4.63
RHEL 3 + lsof 4.63
RHEL 3 + lsof 4.72
Fedora Core 2 + lsof 4.72

Could it possibly a kernel bug?  The original process goes into disk
wait, while 'lsof -b' subsequently succeeds a few times (usually)
before hanging.

Comment 10 Florian La Roche 2005-10-04 14:03:36 UTC
removing old acks

Comment 18 Red Hat Bugzilla 2006-03-22 16:45:10 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2006-0229.html



Note You need to log in before you can comment on or make changes to this bug.