Red Hat Bugzilla – Bug 144869
kernel panic in raid1_end_write_request
Last modified: 2015-01-04 17:15:20 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.5) Gecko/20041111 Firefox/1.0
Description of problem:
I have a DELL PE 2650, Dual Xeon, 1G memory and several software raid partitions. Main duties include NFS, DHCP and samba. No desktop.
This system ran FC1 all of last year without problems. Has just been upgraded to FC3.
I had a similar panic recently with kernel-smp-2.6.9-1.681_FC3 but did not have a serial console setup to capture the panic message.
Version-Release number of selected component (if applicable):
Created attachment 109656
Created attachment 109657
Created attachment 109736
A few days later, another panic.
Panic occurred at 3:42AM while nothing much was happening. The panic message is
very similar, diff below.
Any ideas of what else I could do? Enable some sort of debugging maybe?
Here is the diff between the last panic and this panic:
diff panic1 panic2
< Unable to handle kernel NULL pointer dereference at virtual address 00000038
> Unable to handle kernel paging request at virtual address 00010037
< *pde = 3746e001
> *pde = 36933001
< CPU: 1
> CPU: 3
< eax: 00000000 ebx: f7992400 ecx: f7974220 edx: 00000000
< esi: 00000018 edi: f7978980 ebp: f7992400 esp: c03abf18
> eax: 0000ffff ebx: f7a38600 ecx: f78ef540 edx: 00000000
> esi: 00000018 edi: f7a46e80 ebp: f7a38600 esp: c03adf18
< Process swapper (pid: 0, threadinfo=c03ab000 task=f7f58530)
< Stack: f1103f00 00001000 f8829381 00000000 c015643b 00001000 f1103f00
< c03abf60 c0217acf f74f37d4 00000000 00000000 00000000 00001000
< f7d5002c f7dcfe00 00000001 f88435ec 00000001 f7941680 f74f37d4
> Process swapper (pid: 0, threadinfo=c03ad000 task=f7f5fa40)
> Stack: f3364180 00001000 f8829381 00000000 c015643b 00001000 f3364180
> c03adf60 c0217acf f6d2c8fc 00000000 00000000 00000000 00003000
> f7d4c33c f7dc8e00 00000001 f88435ec 00000001 f7921080 f6d2c8fc
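One thing can be read off the two register dumps above: in both panics the faulting address equals eax + 0x38 (0x00000000 + 0x38 = 0x00000038 and 0x0000ffff + 0x38 = 0x00010037). That would be consistent with the same instruction dereferencing a field at offset 0x38 from a bad pointer held in eax, though nothing in this report confirms that. A minimal Python sketch of the comparison follows. The helper names (parse_registers, fault_address, changed_registers) are made up for illustration, and the input text is copied from the diff above.

```python
import re

# Matches "eax: 00000000"-style pairs as printed in an i386 oops.
REGISTER_RE = re.compile(r"\b(e[a-z]{2}):\s*([0-9a-f]{8})")
FAULT_RE = re.compile(r"virtual address ([0-9a-f]{8})")

# Lines copied from the diff between the two panics.
PANIC1 = """\
Unable to handle kernel NULL pointer dereference at virtual address 00000038
eax: 00000000 ebx: f7992400 ecx: f7974220 edx: 00000000
esi: 00000018 edi: f7978980 ebp: f7992400 esp: c03abf18
"""
PANIC2 = """\
Unable to handle kernel paging request at virtual address 00010037
eax: 0000ffff ebx: f7a38600 ecx: f78ef540 edx: 00000000
esi: 00000018 edi: f7a46e80 ebp: f7a38600 esp: c03adf18
"""


def parse_registers(text):
    """Return {register name: integer value} for each register in the dump."""
    return {name: int(value, 16) for name, value in REGISTER_RE.findall(text)}


def fault_address(text):
    """Return the faulting virtual address reported in the oops header."""
    match = FAULT_RE.search(text)
    if match is None:
        raise ValueError("no fault address found")
    return int(match.group(1), 16)


def changed_registers(first, second):
    """Return the sorted names of registers whose values differ."""
    a, b = parse_registers(first), parse_registers(second)
    return sorted(name for name in a if a[name] != b.get(name))


if __name__ == "__main__":
    for label, text in (("panic1", PANIC1), ("panic2", PANIC2)):
        offset = fault_address(text) - parse_registers(text)["eax"]
        print(f"{label}: fault address - eax = {offset:#x}")
    print("registers that differ:", ", ".join(changed_registers(PANIC1, PANIC2)))
```

The same functions can be pointed at the full text of the attached panic messages to see whether later panics follow the same pattern.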
I've found a slab-corruption bug today, and will be pushing out an update soon.
Your problem(s) could just be caused by that in-memory corruption.
Now running 2.6.10-1.741_FC3smp
No news is good news.
Created attachment 109797
panic with 2.6.10-1.741_FC3smp
Bad news. Another panic. Still in raid1_end_write_request. This time with the
new 2.6.10-1.741_FC3smp kernel.
Created attachment 109837
Another panic with 2.6.10-1.741_FC3smp
I guess there are enough of these panic messages posted here now.
Just noticed 2.6.10-ac10 with:
* Fix bio free before reuse case for clones (Jens Axboe)
| Fixes assorted raid oops/crashes
I wonder if that will help?
Trying kernel-smp-2.6.10-1.747_FC3 from
This has 2.6.10-ac10 according to the changelog.
No panic this time, just a lock-up. Nothing on the console(s), and the sysrq key
would not do anything (it is enabled). The Caps Lock and other keyboard LEDs are
all off and do not come back on. The machine does not respond to ping.
Nothing in syslog messages.
This system has an NMI button. Is it worth enabling that? nmi_watchdog=1 on the
kernel command line is the only way to do that, right?
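Whichever way the parameter is set, it is easy to confirm after a reboot that it actually reached the kernel, since the running kernel's boot parameters are exposed in /proc/cmdline. The following is a small Python sketch for checking that. The function names are hypothetical, the sample command line in the demo is invented for illustration rather than taken from this system, and the simple whitespace split does not handle quoted values that contain spaces.

```python
def parse_cmdline(text):
    """Split a kernel command line into {parameter: value}.

    Flags without '=' (for example 'ro') map to None.
    Quoted values containing spaces are not handled by this sketch.
    """
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def read_cmdline(path="/proc/cmdline"):
    """Read the running kernel's command line (Linux only)."""
    with open(path) as handle:
        return handle.read()


def report(params, names=("nmi_watchdog", "acpi")):
    """Return one human-readable line per parameter of interest."""
    lines = []
    for name in names:
        if name in params:
            lines.append(f"{name} is set to {params[name]!r}")
        else:
            lines.append(f"{name} is not on the command line")
    return lines


if __name__ == "__main__":
    # Invented sample; on the real machine use parse_cmdline(read_cmdline()).
    sample = "ro root=/dev/md0 nmi_watchdog=1 console=ttyS0,9600"
    for line in report(parse_cmdline(sample)):
        print(line)
```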
Oh yes, this is kernel 2.6.10-1.747_FC3smp
Kernel 2.6.10-1.747_FC3smp has been running with numerous panics and
lockups. Nothing consistent. I have started to suspect hardware problems,
but the Dell diagnostics picked up nothing.
Thought I had a memory problem when I ran memtest86 from the FC3
rescue disk but I discovered I had to turn off USB BIOS and then
memtest86 ran OK.
Running 2.6.10-1.747_FC3smp now with USB BIOS disabled. Could that
After disabling USB in the BIOS, 2.6.10-1.747_FC3smp has been running without
problem for 2.5 days now.
So was this problem a buggy BIOS crashing the kernel? Or have I just moved
something around in memory and the problem is now hidden? I guess I will never know.
Anyway, I'm becoming happier as each hour of uptime passes. Since I seem to be
the only person in the universe with this problem, I'll close this bug as
NOTABUG after a few more days of uptime.
I spoke too soon. Another panic in raid1_end_write_request this morning.
It's been a stable few months now. I've been running FC3 kernels with acpi=off
and it's been rock solid. I have no hyperthreading, but this system does not have
a huge workload and it's doing its job.
I'm the only one who seems to have this problem. Other people with similar
systems tend to have H/W raid. They seem to have a whole new set of stability
problems with the 2.6 kernel. The problems (mostly) seem to go away when you update
to the latest Dell firmware/BIOS.
So I guess I'll put the bug down to buggy Dell BIOSs even though I have not
tested this. This has been a theme in some other Dell systems I have as well.
I'll close this bug as NOTABUG and do my bit to tidy up bugzilla.