Bug 57940

Summary: 'top' and 'ps' hang, possibly while accessing /proc
Product: [Retired] Red Hat Linux Reporter: Jason M. Sullivan <jsullivan>
Component: kernel Assignee: Arjan van de Ven <arjanv>
Status: CLOSED ERRATA QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.2   
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2002-01-02 21:00:05 UTC Type: ---

Description Jason M. Sullivan 2002-01-02 21:00:01 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.6) Gecko/20011120

Description of problem:
For reasons as yet undetermined, 'top' and 'ps' hang on a system that seems
otherwise stable.  Coincident with 'top' and 'ps' starting to hang, a WINE
session running Lotus Notes locks up.

Doing an 'ls -lR' of the /proc file system causes a lockup, as well.  More
specifically, it seems to hang while listing out the directory of a
particular process ID.  I'd wager that that directory belongs to the wine
process that locks up.
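
One way to confirm which process ID is involved, without hanging the shell
you are working in, is to probe each /proc/<pid> entry from a throwaway child
process and give up on it after a timeout.  The following is only a rough
sketch under assumptions: the function name find_hanging_pids is made up for
illustration, and the choice of the 'cmdline' file and the 2 second timeout
are guesses, since the report does not establish which file in the directory
actually blocks.

```python
import os
import subprocess
import sys


def find_hanging_pids(proc_root="/proc", timeout=2.0, probe_file="cmdline"):
    """Return the PIDs whose proc entry could not be read within `timeout` seconds.

    Hypothetical helper: `probe_file` is an assumption, not something the
    report identifies as the file that blocks.
    """
    stuck = []
    pids = sorted((n for n in os.listdir(proc_root) if n.isdigit()), key=int)
    for name in pids:
        path = os.path.join(proc_root, name, probe_file)
        # Do the read in a child process so that a hang blocks the child,
        # not this script.
        child = subprocess.Popen(
            [sys.executable, "-c",
             "import sys; open(sys.argv[1], 'rb').read()", path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        try:
            child.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Deliberately do not kill-and-wait here: if the child is itself
            # unkillable, waiting for it would hang this script as well.
            stuck.append(int(name))
    return stuck
```

Calling find_hanging_pids() with no arguments scans the real /proc and
returns a list of suspect PIDs.  A process that exits during the scan makes
the child fail quickly, so it is not reported.  If the hang is as described
above, any probe child that gets stuck will probably be unkillable too and
will linger until the reboot.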

All these lockups are unkillable (even using -9).  The machine must be
rebooted (a tricky problem, considering /proc is corrupt).

So, what's the next step in trying to fix this?

Version-Release number of selected component (if applicable):


How reproducible:
Sometimes

Steps to Reproduce:
1. Run Lotus Notes under WINE.
2. Wait for it to hang.
3. Wail and gnash teeth.

Actual Results:  /proc gets corrupted, and things that depend on it start
hanging.

Expected Results:  Well, ideally, nothing.  Notes under WINE shouldn't
hang, either, but I'll tackle that locally (since it's a local version of
WINE and notes).

Additional info:

I'm filing this as a kernel bug, because the /proc filesystem is the
responsibility of the kernel, and userspace programs shouldn't be able to
corrupt it like this.

Comment 1 Arjan van de Ven 2002-02-11 16:20:25 UTC
A deadlock in this area has been fixed in the 2.4.9-21 kernel, released as an
erratum.

Comment 2 Jason M. Sullivan 2002-06-27 19:33:05 UTC
Using this has fixed it for some machines, but not for others (some of them lock
up more now).  Where can I find documentation on the errata, and how can this be
debugged?