Bug 131712
Summary: | "lsof -b" hangs if a process is stuck in disk-wait | | |
---|---|---|---|
Product: | Fedora | Reporter: | Lon Hohberger <lhh> |
Component: | lsof | Assignee: | Karel Zak <kzak> |
Status: | CLOSED RAWHIDE | QA Contact: | Brock Organ <borgan> |
Severity: | medium | Priority: | medium |
Version: | rawhide | CC: | astrand, camarathe |
Hardware: | All | OS: | Linux |
Doc Type: | Bug Fix | Last Closed: | 2005-03-23 11:53:57 UTC |
Description
Lon Hohberger
2004-09-03 14:20:46 UTC
lsof Readme:

NFS Blocks
==========

Lsof is susceptible to NFS blocks when it tries to lstat() mounted file systems and when it does further processing -- lstat() and readlink() -- on its optional file and file system arguments.

Lsof tries to avoid being stopped completely by NFS blocks by doing the lstat() and readlink() functions in a child process, which returns the function response via a pipe. The lsof parent limits the wait for data to arrive in the pipe with a SIGALRM, and, if the alarm trips, terminates the child process with a SIGINT and a SIGKILL.

This is as reliable and portable a method for breaking NFS deadlocks as I have found, although it still fails under some combinations of NFS version, UNIX dialect, and NFS file system mount options. It generally succeeds when the "intr" or "soft" mount options are used; it generally fails when the "hard" mount option is used.

When lsof cannot kill the child process, a second timeout causes it to stop waiting for the killed child to complete. While the second timeout allows lsof to complete, it may leave behind a hung child process. Unless warnings are inhibited by default or with the -w option, lsof reports the possible hung child.

NFS block handling was updated with suggestions made by Andreas Stolcke. Andreas suggested using the alternate device numbers that appear in the mount tables of some dialects when it is not possible to stat(2) the mount points. The -b option was added to direct lsof to avoid the stat(2) and readlink(2) calls that might block on NFS mount points and always use the alternate device numbers. If warning message issuance is enabled and you don't want warning messages about what lsof is doing, use the -w option, too.

The -O option directs lsof to avoid doing the potentially blocking operations in child processes. Instead, when -O is specified, lsof does them directly. While this consumes far less system overhead, it can cause lsof to hang, so I advise you to use -O sparingly.
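For readers who want to see the shape of the child-process technique the README describes, here is a rough Python sketch. It is not lsof's code (lsof is written in C), and it swaps the README's SIGALRM for a select() timeout on the pipe to keep the example short. The name run_in_child and both timeout parameters are made up for illustration.

```python
import os
import select
import signal
import time


def run_in_child(func, timeout=5.0, kill_wait=2.0):
    """Run func() (which must return an int) in a forked child process.

    Returns ("ok", value), ("error", errno_or_None) or ("timeout", note).
    Hypothetical helper; only the overall flow follows the README.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: make the potentially blocking call, report through the pipe.
        os.close(read_fd)
        try:
            os.write(write_fd, b"ok %d" % func())
        except OSError as exc:
            os.write(write_fd, b"err %d" % exc.errno)
        finally:
            os._exit(0)

    # Parent: wait a bounded time for the child's answer.
    os.close(write_fd)
    try:
        ready, _, _ = select.select([read_fd], [], [], timeout)
        if ready:
            data = os.read(read_fd, 64).decode().split()
            os.waitpid(pid, 0)
            if len(data) != 2:
                return ("error", None)  # child exited without an answer
            return ("ok" if data[0] == "ok" else "error", int(data[1]))

        # First timeout tripped: SIGINT, then SIGKILL, as in the README.
        os.kill(pid, signal.SIGINT)
        os.kill(pid, signal.SIGKILL)

        # Second timeout: stop waiting even if the child cannot be reaped,
        # for example because it is stuck in uninterruptible disk-wait.
        deadline = time.monotonic() + kill_wait
        while time.monotonic() < deadline:
            reaped, _ = os.waitpid(pid, os.WNOHANG)
            if reaped:
                return ("timeout", "child killed")
            time.sleep(0.05)
        return ("timeout", "child may be hung")
    finally:
        os.close(read_fd)
```

For example, run_in_child(lambda: os.lstat("/mnt/data").st_dev) would return the device number, or a "timeout" result if the call blocks; /mnt/data is a placeholder path. The last return mirrors the "possible hung child" case: the parent finishes, but the child may stay behind.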
Maybe the Linux port of lsof incorrectly detects NFS in /proc/mounts. Can you send me the output of "cat /proc/mounts" from your system?
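As a quick way to see which entries in /proc/mounts look like NFS, something along these lines might help. This is only a sketch and says nothing about how lsof itself parses the file. It assumes the usual whitespace-separated layout (device, mount point, file system type, options) and treats only the types "nfs" and "nfs4" as NFS client mounts. The function name and the sample text are made up.

```python
# Assumption: these two type names are what an NFS client mount shows up as.
NFS_TYPES = {"nfs", "nfs4"}


def nfs_mounts(mounts_text):
    """Return (device, mount_point, fstype, options) for NFS-looking entries."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fstype, options = fields[:4]
        if fstype in NFS_TYPES:
            found.append((device, mount_point, fstype, options.split(",")))
    return found


# Hypothetical sample, not taken from the reporter's system.
SAMPLE = (
    "server:/export /mnt/data nfs rw,hard,intr,addr=192.0.2.1 0 0\n"
    "/dev/sda1 / ext3 rw 0 0\n"
    "nfsd /proc/fs/nfsd nfsd rw 0 0\n"
)

for device, mount_point, fstype, options in nfs_mounts(SAMPLE):
    print(device, mount_point, fstype, "hard" in options)
```

On a real system the input would be the contents of /proc/mounts. Keeping the option list makes it easy to check for "hard", "soft" or "intr", which the README says affect whether a blocked child can be killed.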