Bug 228076 - kernel spinlock panic in inode.c
Summary: kernel spinlock panic in inode.c
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.4
Hardware: All
OS: Linux
Priority: urgent
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Peter Staubach
QA Contact: Brian Brock
URL:
Whiteboard:
Duplicates: 231225
Depends On: 245197
Blocks: 222397 234251 240855 245198
Reported: 2007-02-09 21:19 UTC by Kathy Whyte
Modified: 2018-10-19 19:06 UTC (History)
CC List: 5 users

Fixed In Version: RHBA-2007-0791
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-11-15 16:19:50 UTC
Target Upstream Version:
Embargoed:


Attachments
Proposed patch (1.11 KB, patch)
2007-02-21 20:09 UTC, Peter Staubach


Links
System: Red Hat Product Errata
ID: RHBA-2007:0791
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 6
Last Updated: 2007-11-14 18:25:55 UTC

Description Kathy Whyte 2007-02-09 21:19:52 UTC
Description of problem: System panics with message of "Kernel panic - not
syncing: fs/nfs/inode.c:598: spinlock(fs/inode.c:e2508bb68) already locked by
fs/nfs/inode.c/1043"   Or very similar message.

Version-Release number of selected component (if applicable):
2.6.9-42.0.3.EL

How reproducible: Random


Steps to Reproduce: can't reproduce at will
Actual results: System hangs/crashes and message above shows on console


Expected results: system to not crash/hang.


Additional info: Not actually running NFS, but are running Sharity to mount
windows DFS via CIFS/NFS type emulation.

Comment 1 Peter Staubach 2007-02-21 14:44:08 UTC
Is there any idea what the system is doing when these sporadic system
panics occur?

Comment 2 Kathy Whyte 2007-02-21 15:44:25 UTC
Unfortunately no.
Is there a way I can find out?

And unfortunately I don't recall which system reboot was for which issue I'm
having with the systems.

Interestingly, I don't think it has happened since I filed the report...

I don't have a core file either... not sure it generates one, but possibly not
enabled.  I don't recall how to check if system crash core is enabled or how to
enable it.

Kathy Whyte

Comment 3 Peter Staubach 2007-02-21 16:28:38 UTC
Well, without some information about the situation, I don't see that
there is much that I can do.  I haven't seen this situation on any of
my systems nor heard anyone else complain about the problem, so I
don't know where to start looking.

From the message segments, it appears that on a uniprocessor system,
a spinlock was found to be locked when an attempt was made to lock it.
On a uniprocessor system, all spinlocks should be effectively no-ops
and there should be no contention for them.  Hence all spinlocks, on
these systems, should be acquired and then released before a process
context switch is made.
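
To illustrate the point about uniprocessor spinlocks, here is a minimal userspace sketch in C of how a debug build could notice a second lock attempt and print a report in the same style as the message in this bug. The names `dbg_spinlock`, `dbg_spin_lock_at`, `dbg_spin_lock` and `dbg_spin_unlock` are hypothetical. This is not the kernel's actual implementation; it only assumes that the lock remembers the file and line of its current holder.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical debug lock. With a single CPU there is nothing to spin on,
 * so the "lock" only records whether it is held and by whom. */
struct dbg_spinlock {
    int locked;          /* 1 while the lock is held */
    const char *owner;   /* source file that took the lock */
    int owner_line;      /* source line that took the lock */
};

/* Take the lock on behalf of file:line.
 * Returns 0 on success, or -1 if the lock was already held, which on a
 * uniprocessor can only mean the same code path locked it twice. */
static int dbg_spin_lock_at(struct dbg_spinlock *l, const char *file, int line)
{
    if (l->locked) {
        fprintf(stderr, "%s:%d: spin_lock(%p) already locked by %s/%d\n",
                file, line, (void *)l, l->owner, l->owner_line);
        return -1;
    }
    l->locked = 1;
    l->owner = file;
    l->owner_line = line;
    return 0;
}

/* Release the lock; releasing a lock that is not held is also a bug. */
static void dbg_spin_unlock(struct dbg_spinlock *l)
{
    assert(l->locked);
    l->locked = 0;
}

/* Convenience wrapper that records the caller's location. */
#define dbg_spin_lock(l) dbg_spin_lock_at((l), __FILE__, __LINE__)
```

Calling `dbg_spin_lock()` twice on the same lock with no unlock in between triggers the report. That is the pattern an error path would produce if it re-took a lock its caller already held.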

How about if I close this bugzilla and if it happens again and more
information is available, then this bugzilla can be reopened and I
will look at it more then?

Comment 4 Kathy Whyte 2007-02-21 16:48:11 UTC
Can you first tell me how to get more information: for example possibly
obtaining a crash or "kernel" dump?

I certainly understand the situation about not being able to do anything without
more information as I too am a computer support/system administrator type.

Comment 5 Peter Staubach 2007-02-21 17:50:23 UTC
I think that the diskdump support is what you are looking for.

Comment 6 Peter Staubach 2007-02-21 17:51:31 UTC
I could also use the entire and exact message which is printed when
the system fails like this.

Comment 7 Kathy Whyte 2007-02-21 17:59:26 UTC
That's in my problem description.  By "very similar" I mean this message:

Kernel panic - not syncing: fs/nfs/inode.c:598: spinlock(fs/inode.c:e2508bb68)
                                                                    ^^^^^^^^^
already locked by fs/nfs/inode.c/1043

The hex code above the ^^^(carets) is the only difference in the message.

I'll look into diskdump.

Thanks.

Kathy Whyte

Comment 8 Peter Staubach 2007-02-21 18:39:42 UTC
I guess that I wasn't sure what, "Or very similar message.", meant
and whether that implied that the numbers may or may not be the
exact ones.  They also weren't lining up with the ones from my
current RHEL-4 source, so I was a little concerned.

I grabbed a copy of the source corresponding to the 42.0.3 kernel
and now the line numbers match up with the expected calls.

Comment 9 Peter Staubach 2007-02-21 18:41:54 UTC
So, a little analysis shows the code path which can trigger this
panic.  It seems that when the NFS client attempted to update
the attributes of a file, the file type had changed.  File types
are not supposed to change for the life of a specific file.

Even so, the kernel should handle this error correctly and not
panic.

What was the NFS server for the client which panicked?

Comment 10 Peter Staubach 2007-02-21 19:06:23 UTC
This problem appears to have been fixed upstream, so I will port back
those changes.  This will keep the client from panicking in this
situation in the future.

There is still a problem when the server changes the type of a file
which is associated with a particular file handle.  For NFSv2 and
NFSv3, file handles are supposed to be persistent and files are not
allowed to change type once they are created.

Comment 11 Peter Staubach 2007-02-21 20:09:05 UTC
Created attachment 148532 [details]
Proposed patch

Comment 13 Kathy Whyte 2007-02-21 21:38:23 UTC
My employer's policy is that I am not allowed to customize the kernel myself.
However, if you supplied me with a customized rpm, I could install it without
getting into trouble from my employer.  Would this be a possibility?

Thanks,
Kathy Whyte

Comment 14 Peter Staubach 2007-02-21 22:12:00 UTC
I don't currently have a way of building or supplying you with an
updated RPM, I'm sorry.  I am but a lowly development engineer...  :-)

Comment 15 Peter Staubach 2007-02-21 22:13:01 UTC
By the way, what is the NFS server that the client has mounted when the
client panics in this fashion?

Comment 16 Kathy Whyte 2007-02-21 23:14:47 UTC
If you can't do this, could you possibly put me in touch or re-assign to someone
who can?

I am trying to find more specifics about our NFS server before I reply on that
one.  Sorry for not acknowledging that before.

THANKS!

Kathy Whyte

Comment 17 Kathy Whyte 2007-03-02 14:04:11 UTC
Issue continues in 2.6.9-42.0.10...

Exact error as follows:
Fs/nfs/inode.c: 598: spin_lock(fs/inode.c:f55901fc) already locked by
fs/nfs/inode.c/1043



Comment 18 Peter Staubach 2007-03-02 15:14:38 UTC
Right.  The proposed patch has not been included in any official RHEL-4
kernel build yet.  It is targeted at RHEL-4.6.

Comment 19 Kathy Whyte 2007-03-02 15:45:11 UTC
Understood, but could I get the patch for the src.rpm for that version?

Kathy Whyte

Comment 20 Kathy Whyte 2007-03-02 15:58:30 UTC
We do, but I believe we found that the code differs quite a bit between
2.6.9-42.0.3, which your patch targets, and the other revisions, so applying
the patch was not straightforward for us, since we are not kernel patch
experts (or even coders).

Thanks,
Kathy Whyte

Comment 21 Kathy Whyte 2007-03-02 16:01:34 UTC
Sorry, that last item didn't make sense out of context.  I didn't realize I was
responding to a personal email rather than to the bugzilla.

Here's the email I was responding to...

Hi.

Do you have the source rpm already?  If so, I would suggest just attempting to
apply the patch.  If that doesn't work, please let me know.

    Thanx...

       ps


Comment 22 Jeff Layton 2007-03-06 21:17:13 UTC
*** Bug 231225 has been marked as a duplicate of this bug. ***

Comment 23 RHEL Program Management 2007-04-18 22:29:42 UTC
This request was evaluated by Red Hat Kernel Team for inclusion in a Red
Hat Enterprise Linux maintenance release, and has moved to bugzilla 
status POST.

Comment 31 Jason Baron 2007-05-23 18:35:27 UTC
Committed in stream U6 build 55.4. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 32 Issue Tracker 2007-06-26 01:30:37 UTC
Internal Status set to 'Resolved'
Status set to: Closed by Tech
Resolution set to: 'RHEL 4.6'
Summary edited.

This event sent from IssueTracker by mmatsuya 
 issue 115150

Comment 35 errata-xmlrpc 2007-11-15 16:19:50 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0791.html


