Bug 436087 - panic in shrink
Summary: panic in shrink
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 8
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Depends On:
Reported: 2008-03-05 11:32 UTC by JW
Modified: 2008-03-07 03:37 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2008-03-07 02:40:34 UTC

Description JW 2008-03-05 11:32:49 UTC
Description of problem:
Kernel panics during modest disk activity

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. create ext3 journaled filesystem
2. create lots of files
3. wait for kernel to crash
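
The report does not say how many files were created, how large they were, or how they were laid out, so the following is only a rough sketch of step 2 under assumed parameters. The function name `create_many_files` and its defaults are hypothetical, and the temporary directory is a placeholder for a directory on the ext3 filesystem under test.

```python
import os
import tempfile


def create_many_files(root, n_dirs=10, files_per_dir=100, size=4096):
    """Create n_dirs * files_per_dir files of `size` bytes under root.

    Returns the number of files created. All counts and sizes are
    assumptions; the original report gives none.
    """
    payload = b"x" * size
    created = 0
    for d in range(n_dirs):
        subdir = os.path.join(root, f"dir{d:04d}")
        os.makedirs(subdir, exist_ok=True)
        for f in range(files_per_dir):
            path = os.path.join(subdir, f"file{f:06d}")
            with open(path, "wb") as fh:
                fh.write(payload)
                fh.flush()
                # Force the write out to disk so the filesystem does real work
                os.fsync(fh.fileno())
            created += 1
    return created


if __name__ == "__main__":
    # Placeholder location: point this at a directory on the ext3 mount
    with tempfile.TemporaryDirectory() as root:
        print(create_many_files(root))
```

Whether a workload like this triggers the panic is unknown; it only makes the "create lots of files" step repeatable.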
Actual results:
kernel crashes in at least two possible ways

Expected results:
kernel must not crash! kernel must be stress tested before being released!!

Additional info:
Instance 1:
  EIP: list_del+0x26/0x5d

Instance 2:
  journal_try_to_free_buffer+0x5c/0x137 [jbd]
  ext3_releasepage+0x0/0x7b [ext3]
  EIP: journal_grab_journal_head+0xf/0x3e

Comment 1 Dave Jones 2008-03-05 17:42:18 UTC
Hmm, so you can repeat this on demand?
If so, can you try the 2.6.24 based kernel from updates-testing ?

Adding Eric to Cc as maybe he's seen something similar in ext3 development.

Comment 2 Eric Sandeen 2008-03-05 18:00:18 UTC
no, I don't *think* I've seen this.  It looks vaguely like bug #428329.  But
then, it's not a lot to go on.  Could you include the full oopses rather than
the heavily-edited versions?

Also, testing on a debug kernel (yum install kernel-debug) might yield clues.

Comment 3 JW 2008-03-05 22:14:36 UTC
I have gone back to kernel- because I have had this running on
other hardware for several months now. Pity that it has disappeared from nearly
every mirror though.  Why do good rpms in updates repository get deleted and
replaced by inferior ones (more patches != better patches)?

Comment 4 Eric Sandeen 2008-03-05 22:24:51 UTC
Hm, I'll be sure to suggest to the kernel maintainers that they discontinue
their irrational, single-minded quest for more and more patches.... 

But anyway, you can usually find older rpms on koji.fedoraproject.org, for
example http://koji.fedoraproject.org/packages/kernel/

If you'd like to see the problem resolved so that you don't have to stick with
fc7 kernels, posting the full oops output, or reproducing the problem on the
debug kernel as I suggested would be a great help.  Otherwise I'm not sure we
can hope for a resolution, with the limited information provided.


Comment 5 Chuck Ebbert 2008-03-05 23:46:29 UTC
Please post the complete oops messages.

Comment 6 JW 2008-03-07 02:27:03 UTC
I have gone back to FC7 kernel (kernel- which is nice and stable.
No problems whatsoever over last couple of days (FC8 kernel crashed at least
once every day).

Cannot vouch for current FC7 kernel update (kernel- because the
numbering has gone from to which doesn't make a lot of sense.

Somebody should make available again because it is good.

Comment 7 Eric Sandeen 2008-03-07 02:40:34 UTC
Without more information, we cannot proceed on this bug.

If you can provide the requested data (full oops output, preferably from a debug
kernel), please re-open.

Comment 8 JW 2008-03-07 03:09:06 UTC
If you are comfortable ignoring the stack traces that I have provided then go
ahead and close this bug and pretend that FC8 kernel is fantastic. That is your
choice, not mine.

It certainly is a clever way to eliminate kernel flaws.

Comment 9 Eric Sandeen 2008-03-07 03:37:04 UTC
It is clearly a bug (well, or perhaps bad memory or whatnot; so far we really
cannot tell), and I'd very much like to fix it if possible.  I've not seen it
reported elsewhere, and you seem to be somewhat uniquely able to hit it.
However, the information you have provided is not enough to go on.  "Something
went wrong down this path" doesn't tell me what went wrong.  Maybe it was a null
pointer.  Maybe it was a bad pointer.  All I can do is guess.

There is a reason that the kernel provides copious amounts of information on an
oops; it is so an attempt can be made at debugging.  You have provided perhaps
20% of the oops info, and seem to be unwilling or unable to run this one more
time to gather the information requested.

I have no testcase; I have no oops output.  I have a stacktrace, and that is
all.  I can't see registers, I don't know what modules were loaded, I don't know
if your kernel was tainted, etc etc.  I don't even know what the actual first
line of the oops was.  If you don't want me to close this bug, I need just a bit
more info from you, as the reporter.  I cannot magically infer the missing
information.
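
As a rough illustration of the pieces listed above (first line, modules, taint state, registers, call trace), here is a minimal sketch that checks a pasted oops for each of them. The function name `missing_sections` is hypothetical, and the marker strings are assumptions based on typical i386 oops output from 2.6-era kernels; other architectures and kernel versions format these lines differently.

```python
import re

# Assumed markers for each part of an oops. These are illustrative
# patterns, not an exhaustive or authoritative list.
CHECKS = {
    "first line": re.compile(
        r"BUG: unable to handle kernel|Oops: [0-9a-fA-F]+|kernel BUG at"
    ),
    "modules": re.compile(r"Modules linked in:"),
    "taint state": re.compile(r"Not tainted|Tainted:"),
    "registers": re.compile(r"\b[er]ax:", re.IGNORECASE),
    "call trace": re.compile(r"Call Trace:"),
}


def missing_sections(oops_text):
    """Return the names of the oops parts not found in oops_text."""
    return [name for name, pattern in CHECKS.items()
            if not pattern.search(oops_text)]


if __name__ == "__main__":
    # The excerpt given as "Instance 1" in the description
    excerpt = "EIP: list_del+0x26/0x5d"
    print(missing_sections(excerpt))
```

Run against the excerpts in the description, a check like this would flag every part as missing, which is the gap being described here.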

I have a hunch that it may be related to another bug I've seen, which in turn
may be related to suspend problems.  But, there is simply not enough here to be
able to tell.

If you can provide the requested info by running the problematic kernel once
more, preferably the debug variant, I am more than willing to spend time digging
into this bug.
