Red Hat Bugzilla – Bug 452485
kernel soft lockup in ext3 (22.214.171.124-92.fc8)
Last modified: 2009-01-09 01:37:53 EST
Description of problem:
The kernel crashed under very intensive disk usage on an HP-Proliant-DL320s
server fully populated with 12x750GB SATA 3.5" hard disks. The server became
unresponsive, and kernel crash dumps were displayed on the serial console. These
dumps are attached to this bug. After reboot, fsck was forced:
/dev/VolGroup00/data contains a file system with errors, check forced.
Version-Release number of selected component (if applicable):
It is a Fedora 8 amd64 system installed recently (May 2008) and with all
packages updated (yum upgrade -y). The running kernel is 126.96.36.199-92.fc8.
Install Fedora8 amd64 running 188.8.131.52-92.fc8, make a very large ext3
filesystem, and run a lot of disk intensive programs: millions of small files,
and several very large files as well.
The hard drive seen by the OS is a very large volume (about 8TB hardware RAID5
array). It was formatted with several small partitions (root, boot, swap) and
one very large ext3 volume (7.5TB = /dev/VolGroup00/data).
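As a rough sanity check on the sizes above, the usable capacity of the array can be estimated. This is only a sketch under assumptions not stated in the report: all 12 disks belong to a single RAID5 set with no hot spare, and "750GB" means decimal gigabytes. The function name is hypothetical.

```python
# Assumptions (not confirmed by the report): one RAID5 set spanning all
# 12 disks, no hot spare, and 750GB meaning 750 * 10**9 bytes per disk.
DISK_COUNT = 12
DISK_BYTES = 750 * 10**9


def raid5_usable_bytes(disk_count, disk_bytes):
    """RAID5 stores one disk's worth of parity, so usable = (n - 1) * size."""
    if disk_count < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (disk_count - 1) * disk_bytes


usable = raid5_usable_bytes(DISK_COUNT, DISK_BYTES)
print("decimal TB:", usable / 10**12)
print("binary TiB:", round(usable / 2**40, 2))
```

Under these assumptions the array offers about 8.25 decimal TB, which is roughly 7.5 TiB and is in line with the "about 8TB" figure quoted above.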
It is a backup server that was running several processes with very intensive
disk usage. The server was doing the following:
- removing very large files with rm
- writing files from the network (ncftp)
- running rsync clients using both local directories and remote rsync servers
Created attachment 310020
kernel crash in ext3 related code
Technically a lockup, not a crash:
BUG: soft lockup - CPU#1 stuck for 11s! [rm:7528]
When/if you hit this again can you do a sysrq-t or sysrq-w and attach that output?
And if you have the facilities to capture a coredump I'd love to have that as well.
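If the serial console output is being captured to a file, soft lockup reports like the one quoted above can be picked out automatically. The following is a minimal sketch, assuming the message format matches the quoted line exactly; the function name and the sample text are illustrative only.

```python
import re

# Matches lines such as: BUG: soft lockup - CPU#1 stuck for 11s! [rm:7528]
# The format is assumed from the single line quoted in this bug.
SOFT_LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]*):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft lockup report found in log_text."""
    hits = []
    for match in SOFT_LOCKUP_RE.finditer(log_text):
        hits.append({
            "cpu": int(match.group("cpu")),
            "seconds": int(match.group("secs")),
            "comm": match.group("comm"),   # name of the stuck process
            "pid": int(match.group("pid")),
        })
    return hits


sample = "BUG: soft lockup - CPU#1 stuck for 11s! [rm:7528]"
for hit in find_soft_lockups(sample):
    print(hit)
```

For the line in this report, the sketch identifies CPU 1, a duration of 11 seconds, and the rm process with PID 7528.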
Thanks for your quick reply.
Before rebooting, I waited about 10 minutes and the server stayed unresponsive.
So maybe it is not a crash, but I am sure the problem lasted a long time: the
server was still not responding after 10 minutes.
I tried to generate a kernel core dump, but unfortunately the magic keys had no
effect. I have no physical access to the machine; I was only connected
through the serial console. Can sysrq work over the serial console?
Is there a way to prepare the server so that it saves a kernel dump
automatically without allocating memory for kdump (crashkernel=128M@16M)?
Make sure you have kernel.sysrq = 1 in your /etc/sysctl.conf. If it is not
there, add it and run sysctl -p to apply the change without rebooting.
Unfortunately there is no way to set up kdump without allocating memory for it
at bootup. sysrq will definitely work over serial console.
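To double-check the sysrq setting described above before the next lockup, the configuration file can be inspected with a short script. This is a minimal sketch, assuming the usual sysctl.conf syntax of key = value lines with # or ; comments; the function name and the sample text are hypothetical.

```python
def sysrq_setting(conf_text):
    """Return the last kernel.sysrq value in sysctl.conf-style text, or None."""
    value = None
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "kernel.sysrq":
            value = val.strip()  # later entries override earlier ones
    return value


# Illustrative sample; on a real system, read /etc/sysctl.conf instead.
sample_conf = """
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
kernel.sysrq = 1
"""
print("kernel.sysrq =", sysrq_setting(sample_conf))
```

The running value can also be read from /proc/sys/kernel/sysrq. On a standard serial console, a sysrq command is normally sent as a BREAK followed by the command key (for example t or w), so the terminal program in use needs a way to send BREAK.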
This message is a reminder that Fedora 8 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 8. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '8'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 8's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 8 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The process we are following is described here:
Fedora 8 changed to end-of-life (EOL) status on 2009-01-07. Fedora 8 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.