Created attachment 436326
Description of problem:
kernel panic, see attachment
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Unknown; happened while shutting down a KVM guest
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.
** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **
Thank you for your bug report. This issue was evaluated for inclusion
in the current release of Red Hat Enterprise Linux. Unfortunately, we
are unable to address this request in the current release. Because we
are in the final stage of Red Hat Enterprise Linux 6 development, only
significant, release-blocking issues involving serious regressions and
data corruption can be considered.
If you believe this issue meets the release blocking criteria as
defined and communicated to you by your Red Hat Support representative,
please ask your representative to file this issue as a blocker for the
current release. Otherwise, ask that it be evaluated for inclusion in
the next minor release of Red Hat Enterprise Linux.
Created attachment 442499
Same panic on .70
Before the most recent oops, I see:
mapcount 2 page_mapcount 3
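For anyone collecting logs for this report, here is a minimal sketch that scans saved console/dmesg output for lines of this form and flags the ones where the two counters disagree. The line format is assumed from the single line quoted above, and the function name is made up for illustration; the real debug output may differ.

```python
import re

# Assumed format, based only on the one line quoted in this report:
#   "mapcount <int> page_mapcount <int>"
MAPCOUNT_RE = re.compile(r"\bmapcount (-?\d+) page_mapcount (-?\d+)")


def find_mapcount_mismatches(log_text):
    """Return (line_no, mapcount, page_mapcount) for each line where the two differ."""
    mismatches = []
    for line_no, line in enumerate(log_text.splitlines(), start=1):
        match = MAPCOUNT_RE.search(line)
        if match is None:
            continue  # not a mapcount line
        mapcount = int(match.group(1))
        page_mapcount = int(match.group(2))
        if mapcount != page_mapcount:
            mismatches.append((line_no, mapcount, page_mapcount))
    return mismatches
```

Running it over the saved log text returns the line numbers to look at, which makes it easier to copy the lines preceding the oops into the report.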
The VM I was running was using a patched upstream qemu-kvm with the following options:
-enable-kvm -m 2000 -smp 2,sockets=2,cores=1,threads=1 -name rhel6vm -uuid 1e3f234d-338f-e438-1d43-14393856409c -nodefconfig -nodefaults -monitor stdio -rtc base=utc -boot c -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/var/lib/libvirt/images/VMs/rhel6vm.img,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -usb -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -netdev tap,script=/home/alwillia/bin/br0-ifup,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:5f:78:73,bus=pci.0,addr=0x3,txtimer=1 -vnc :1 -no-kvm-irqchip -no-kvm-pit
The host system is a single-socket Xeon W3520 (Nehalem class) with 12G of RAM.
I built a kernel that includes the fix for bug 627591. I also added some debug code specific to this bug report, so if this happens again we'll know more about what's going on, but you'll have to include the output before the oops too (it'll be printed as one line before the page_mapcount %d mapcount %d line).
Please use this build until you can reproduce the problem again. I've never seen this myself, and I only recall one report, from CAI Qian in bug 622327, where one stack trace is identical to yours. I thought bug 622327 was caused by bad hardware. It's also worth comparing the CPUs and systems you're using, to be sure it's not the same CPU that leads to bad results, considering that THP stresses parts of the CPU that normally wouldn't be stressed to this extent.
It probably can be explained as the slab RCU race condition fixed by Hugh. I'm not 100% sure though, so we need more testing with the above build to be sure. If it's that bug you need to stress the slab a lot.
I'll mark this as a duplicate of bug 622327 because that bug also has another, different stack trace within the page_lock_anon_vma context (the very function patched by the fix for bug 627591).
*** This bug has been marked as a duplicate of bug 622327 ***
The debug code failed to build on some arches; I have now submitted a build as in comment #8.