Red Hat Bugzilla – Bug 561370
KVM guest crashed during a multi guest database run
Last modified: 2013-01-09 17:16:21 EST
Description of problem:
A KVM guest crashed during a multi-guest run of a database workload. The host is AMD (Six-Core AMD Opteron(tm) Processor 8431).
Version-Release number of selected component (if applicable):
Host and guests running 2.6.18-186.el5
File system used for the testing - ext4 (e4fsprogs-1.41.9-3.el5)
How reproducible:
Happened just once. I do not know whether it can be reproduced.
Steps to Reproduce:
1. Started 4 KVM guests (6 CPUs, 14G each)
2. Ran database workload
3. One of the guests crashed.
Message in /var/log/messages in the guest at the time of the crash.
Feb 2 18:02:48 dhcp47-99 kernel: list_add corruption. prev->next should be ffff81038f7d3e28, but was 0000000000497000
Feb 2 18:02:48 dhcp47-99 kernel: ----------- [cut here ] --------- [please bite here ] ---------
Feb 2 18:02:48 dhcp47-99 kernel: Kernel BUG at lib/list_debug.c:31
Feb 3 08:37:19 dhcp47-99 syslogd 1.4.1: restart.
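For readers unfamiliar with the message above: the "list_add corruption" text is printed by the kernel's linked-list debug check when the neighbours of an insertion point no longer point at each other. The following is a minimal userspace sketch in C of that kind of consistency check. It is not the kernel source. The names init_list_head, checked_list_add and run_demo are hypothetical, and the check returns an error code where the kernel would call BUG(). The stray value 0x497000 is taken from the log purely to simulate an overwritten pointer; it is never dereferenced.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Circular doubly linked list node, in the style of the kernel's list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Hypothetical helper: an empty list points at itself in both directions. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/*
 * Hypothetical helper: insert new_node between prev and next, but first
 * verify that prev and next still reference each other. Returns 0 on
 * success and -1 if the list looks corrupted.
 */
static int checked_list_add(struct list_head *new_node,
                            struct list_head *prev,
                            struct list_head *next)
{
    if (next->prev != prev) {
        fprintf(stderr,
                "list_add corruption. next->prev should be %p, but was %p\n",
                (void *)prev, (void *)next->prev);
        return -1;
    }
    if (prev->next != next) {
        fprintf(stderr,
                "list_add corruption. prev->next should be %p, but was %p\n",
                (void *)next, (void *)prev->next);
        return -1;
    }

    /* Both neighbours agree, so the insertion is safe. */
    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node;
    return 0;
}

/* Simulate a stray write over head.next, then attempt another insertion. */
static int run_demo(void)
{
    struct list_head head, a, b;
    int rc;

    init_list_head(&head);
    rc = checked_list_add(&a, &head, head.next);
    assert(rc == 0);
    (void)rc;

    /* Overwrite the pointer with a non-list value, as in the reported log. */
    head.next = (struct list_head *)(uintptr_t)0x497000;

    /* The check sees head.next != &a and refuses the insertion. */
    return checked_list_add(&b, &head, &a);
}
```

In this sketch, run_demo() fails on the second check, the same one named in the guest's log line. A failure of this kind shows that the pointer was overwritten at some earlier point, not which code overwrote it.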
Expected results:
The guest should continue to run.
Additional info:
A screenshot of the console is attached.
Can you please attempt to reproduce the bug and save the entire oops message? Also, no screenshot appears to be attached.
Will look for possible candidates in the meantime. Sorry for the late reply.
Also, were hugepages being used?
I will try to reproduce the problem when I get a chance, but I am not sure that this issue is reproducible. That is why I captured everything that was reported, hoping it might give some clues.
Also, there is an issue with ext4 when running Oracle databases (BZ 562219). I am not sure whether the two are related.
This test was not using huge pages.
Postponing to RHEL 5.6.
Closing the bug on the grounds that it is a one-time memory corruption report; there is not much that can be done without a reproducible case.
Please reopen if necessary.