Red Hat Bugzilla – Bug 88900
kjournald crashes regularly, system hangs
Last modified: 2007-04-18 12:53:03 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.75 [en] (X11; U; SunOS 5.7 sun4u)
Description of problem:
1. Installed Red Hat Linux 8.0 Professional, as a server running PostgreSQL with
2. kjournald crashes regularly, approx. 4 times a day
3. Support informed me of a patch/update, installed via up2date
4. kjournald still crashes, new type of crash with kernel panic message:
<0> Kernel panic: Aiee, killing interrupt handler
In interrupt handler - not syncing
Support Service Request 229854
(I am new to this forum, Basic Installation Support urged me to submit a bug
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Just happens frequently without any obvious cause. I just wait a while.
Actual Results: kernel panic or kjournald stack trace appears on the console
Expected Results: A stable server for my small database Java application would
have been nice :)
We need the actual console errors, including the stack trace and any preceding
diagnostics, to debug this.
Created attachment 91225 [details]
gzip of /var/log/message.2
The log has both the startup-sequence and some of the crashes, but not all in
the time span covered. I think there are crashes that don't end up in the log.
Created attachment 91310 [details]
output from df -k, df -i and lsmod in a textfile
Not a single one of the oopses in that log is in kjournald, nor do they show
any signs of being inside ext3. They show all the hallmarks of hardware memory
You really need to do a hardware memory test next. www.memtest86.com is the
best place to start.
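To illustrate what such a test does in principle, here is a minimal sketch of a pattern test: write a known byte pattern, read it back, and report any offset that differs. It is only an illustration under stated assumptions. The function names `find_mismatches` and `pattern_test` and the chosen patterns are hypothetical. The sketch runs in user space on one small buffer, so it cannot exercise all physical RAM and is no substitute for a boot-time tester like memtest86.

```python
def find_mismatches(buf, pattern):
    """Return the offsets in buf whose byte differs from the expected pattern byte."""
    return [i for i, b in enumerate(buf) if b != pattern]


def pattern_test(size_bytes=1 << 20, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern into a buffer, read it back, and collect mismatches.

    Returns a list of (offset, pattern) tuples; an empty list means every
    byte read back as written. The buffer size and patterns are illustrative
    choices, not values taken from any real memory tester.
    """
    buf = bytearray(size_bytes)
    bad = []
    for p in patterns:
        buf[:] = bytes([p]) * size_bytes      # write pass
        for offset in find_mismatches(buf, p):  # read-back pass
            bad.append((offset, p))
    return bad


if __name__ == "__main__":
    errors = pattern_test()
    print("mismatches found:", len(errors))
```

A real tester repeats many such passes with more varied patterns over every addressable region, which is why a fault may only show up after hours of running.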
You are right! Thanks a lot!
After running the tests for a day, one of the memory modules was flagged as
faulty. After removing the module, the machine has been up and running for 4 days
with no faults. Standard memory tests didn't detect anything. kjournald was
blamed because it was involved in the first n of the panics.
A humble tip: why not inform RH support service of some generic error types? If
they had told me this from the start, since they had the same information, you
and I would have saved some days of work.
Noted, but memory errors can show up as problems _anywhere_ in the kernel, and
it's a little tricky to give exact footprints of what stands out as such a problem.
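One rough way to look for that "anywhere" pattern in a saved log is to tally which process each oops was attributed to. The sketch below assumes oops text containing a line of the form `Process NAME (pid: N, ...`; that line format, the function name `tally_oops_processes`, and the usage path are assumptions for illustration, not taken from the attached log. The process name only identifies the task that was running at the time, so a scattered tally is a hint to test the hardware, not proof of bad memory.

```python
import re
from collections import Counter

# Assumed oops line shape: "... Process NAME (pid: N, ..."
PROCESS_RE = re.compile(r"Process (\S+) \(pid: \d+")


def tally_oops_processes(lines):
    """Count how often each process name appears in oops 'Process' lines."""
    counts = Counter()
    for line in lines:
        match = PROCESS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, calling `tally_oops_processes(open(path, errors="replace"))` on an uncompressed copy of the log (with `path` pointing at that copy) returns a `Counter` keyed by process name.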