Bug 394761

Summary: sosreport hangs running lsof
Product: Red Hat Enterprise Linux 5 Reporter: Bryn M. Reeves <bmr>
Component: sos Assignee: Navid Sheikhol-Eslami <neslami>
Status: CLOSED DUPLICATE QA Contact:
Severity: medium Docs Contact:
Priority: medium    
Version: 5.1 CC: agk, bmr
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-11-22 10:16:07 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Bryn M. Reeves 2007-11-21 19:15:07 UTC
Description of problem:
The current sosreport passes the "-M" flag to lsof to prevent potential blocking
operations, but that is insufficient. The sosreport progress meter also displays
100% even though nothing is progressing and the elapsed time keeps increasing:

 Progress [###################100%##################][05:36/05:36]
 Progress [###################100%##################][05:40/05:40]

[...]

 Progress [###################100%##################][26:42/26:42]

13673 pts/8    Ss     0:00  |   \_ -bash
13711 pts/8    Sl+    0:02  |   |   \_ python /usr/sbin/sosreport
13908 pts/8    S+     0:00  |   |       \_ /usr/sbin/lsof -b +M -n -l
25471 pts/8    Z+     0:00  |   |       \_ [cp] <defunct>

I ran into this situation on a system where an admin had run "umount -lf" on
an unresponsive NFS file system. The mount is detached from the namespace
immediately, but it is only actually unmounted ("lazy-umounted") once all
references to it are closed. Since entries in /proc/<pid> directories may still
point at files on this file system, the lsof spawned by sosreport may block
when it accesses them.

Version-Release number of selected component (if applicable):
sos-1.7-9.1.el5

How reproducible:
100% on this host.

Steps to Reproduce:
[ one example - there are probably lots more ]
1. Mount an NFS file system
2. Make the server go away
3. Try an operation on the file system (e.g. ls /mount/point)
4. Lazy/forced umount (umount -fl /mount/point)
5. Run sosreport
  
Actual results:
sosreport appears to hang waiting for lsof. Progress reaches 100% but the
elapsed-time clock keeps counting up.


Expected results:
sosreport should never hang when run with the default options.
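
One way to meet that expectation is to run each external command under a time limit. The following is only a minimal sketch in modern Python 3, not the sos code and not the fix tracked in the duplicate bug; run_with_timeout and its parameters are hypothetical names chosen for illustration. It starts the command in its own session so that any helper processes it forks can be signalled together, and it never waits indefinitely for a child that may not exit.

```python
import os
import signal
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run argv with a time limit (hypothetical helper, not part of sos).

    Returns (status, output). status is None if the command timed out.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # new process group: pgid == proc.pid
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return proc.returncode, out.decode(errors="replace")
    except subprocess.TimeoutExpired:
        # Kill the whole group so forked helpers go away as well.
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # already gone
        # Bounded wait only: a process stuck in uninterruptible sleep
        # may not die even after SIGKILL, so give up after one second.
        try:
            out, _ = proc.communicate(timeout=1)
        except subprocess.TimeoutExpired:
            out = b""
        return None, out.decode(errors="replace")
```

Called as run_with_timeout(["/usr/sbin/lsof", "-b", "+M", "-n", "-l"], 60), with the arguments taken from the process listing above and an arbitrary 60 second limit, a status of None would tell the caller to record the timeout and move on to the next command instead of waiting.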

Comment 1 Navid Sheikhol-Eslami 2007-11-22 10:16:07 UTC

*** This bug has been marked as a duplicate of 368261 ***