Bug 208585 - panic when freeing nfs locks of orphaned processes
Summary: panic when freeing nfs locks of orphaned processes
Status: CLOSED DUPLICATE of bug 218777
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel   
Version: 4.4
Hardware: i386
OS: Linux
Target Milestone: ---
Assignee: Peter Staubach
QA Contact: Brian Brock
Depends On:
Reported: 2006-09-29 14:48 UTC by Ben Walton
Modified: 2008-06-18 15:51 UTC

Doc Type: Bug Fix
Last Closed: 2008-06-18 15:51:02 UTC

Attachments
logged output after this BUG() is triggered. (3.09 KB, text/plain)
2006-09-29 14:50 UTC, Ben Walton
logged output from kernel panic (with sas 8 patches applied) (2.97 KB, text/plain)
2006-10-02 19:41 UTC, Ben Walton

Description Ben Walton 2006-09-29 14:48:55 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20060909 Firefox/

Description of problem:
SAS 8 does not always exit cleanly and leaves orphaned processes hanging around that still hold NFS locks.  When these processes are killed, the kernel panics with a BUG() trace (see attachment).  The panic sometimes occurs before the process is fully orphaned as well (or the crash happens before I can run ps and see the task again).

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Start SAS 8 (displayed remotely over ssh tunneled X)
2. kill -TERM <pid of sshd process>
3. kill -TERM <pid of orphaned process>
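
For anyone without SAS at hand, the sequence above can be roughly approximated with a small script. This is only a sketch under assumptions: the function name spawn_orphan_lock_holder and the double-fork approach are illustrative, it was not part of the original report, and whether it actually triggers the panic on an affected kernel is untested. It forks twice so that a process holding a POSIX lock on the given file is left orphaned, then returns that process's pid.

```python
import fcntl
import os
import signal


def spawn_orphan_lock_holder(path):
    """Leave behind an orphaned process holding an exclusive lock on `path`.

    Hypothetical helper for illustration only. Returns the pid of the
    orphaned lock holder.
    """
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        # Intermediate process: start the lock holder, then exit at once
        # so the holder loses its parent and becomes an orphan.
        os.close(read_fd)
        if os.fork() == 0:
            try:
                fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
                # POSIX byte-range lock over the whole file; on an NFS
                # mount this is the kind of lock the report refers to.
                fcntl.lockf(fd, fcntl.LOCK_EX)
                os.write(write_fd, str(os.getpid()).encode())
                os.close(write_fd)
                signal.pause()  # hold the lock until a signal arrives
            finally:
                os._exit(1)  # never fall back into the caller's code
        os._exit(0)
    # Original process: reap the intermediate, then read the holder's pid.
    os.close(write_fd)
    os.waitpid(child, 0)
    with os.fdopen(read_fd) as pipe:
        return int(pipe.read())
```

Called with a path on the NFS mount, it returns the pid to pass to kill -TERM in step 3. On a healthy kernel the lock should simply be released when the holder dies.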

Actual Results:
See attached log file (kernel panic BUG() trace).

Expected Results:
Process should be terminated cleanly and all be right with the world.

Additional info:
At first I thought that this was only triggered if the process was orphaned completely, but I have seen instances of this panic where the process is not fully orphaned before the system goes down.  It happens regularly while people are using SAS.

For now, I've applied some patches to SAS that are supposed to help it clean up better if the X server dies beneath it.

I believe that this bug is related to the tgpid being -1 when the nfs locks are released.

Comment 1 Ben Walton 2006-09-29 14:50:23 UTC
Created attachment 137396
logged output after this BUG() is triggered.

Comment 2 Ben Walton 2006-10-02 19:41:39 UTC
Created attachment 137590
logged output from kernel panic (with sas 8 patches applied)

I thought that the sas8 patches had at least bandaided things well enough to
prevent this issue from happening, but got a lovely surprise this afternoon.

I hope it helps.


Comment 3 Ben Walton 2006-10-02 19:46:15 UTC
I could also add that this server was recently 'upgraded' from fc4 to as4. 
Prior to the upgrade, sas still left its processes lying around, but this bug
was not triggered.  The nfs server is running fc4.


Comment 4 Trond H. Amundsen 2006-10-30 14:56:41 UTC
Just a "me too". We just experienced this bug on our general login machine (~180
simultaneous users, lots of NFS homedirs mounted). We were also running the
2.6.9-42.0.2.ELsmp kernel.


Comment 5 Ben Walton 2006-10-30 15:51:39 UTC
I experienced the same behaviour after moving to kernel 2.6.9-42.0.3.ELsmp also.
I've had to return to the original fc4 setup as this is a high demand server
and I couldn't have this much downtime.  I have the as4 server live still in a
testing area and would be willing to run any tests necessary on it to help
resolve the issue.


Comment 6 Peter Staubach 2006-12-08 14:30:28 UTC
I believe that this is a duplicate of bz 218777.

Comment 7 Peter Staubach 2008-06-18 15:51:02 UTC
If anyone is still seeing this problem, would they mind trying a
2.6.9-52 kernel or newer? If the problem is still occurring in
the newer kernels, then please reopen this bugzilla.

*** This bug has been marked as a duplicate of 218777 ***
