Bug 620874 - kernel BUG at mm/huge_memory.c:1269!
Status: CLOSED DUPLICATE of bug 622327
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
Keywords: RHELNAK
Depends On:
Blocks:
Reported: 2010-08-03 12:46 EDT by Alex Williamson
Modified: 2010-09-02 14:02 EDT (History)
CC List: 3 users

Doc Type: Bug Fix
Last Closed: 2010-09-02 12:45:17 EDT

Attachments
panic log (4.54 KB, text/plain), 2010-08-03 12:46 EDT, Alex Williamson
panic (5.53 KB, text/plain), 2010-09-01 17:19 EDT, Alex Williamson

Description Alex Williamson 2010-08-03 12:46:31 EDT
Created attachment 436326
panic log

Description of problem:
kernel panic, see attachment

Version-Release number of selected component (if applicable):
kernel-2.6.32-54.el6.x86_64

How reproducible:
unknown

Steps to Reproduce:
1. Unknown; the panic happened while shutting down a KVM guest.

Actual results:
host panic

Expected results:
no panic

Additional info:
Comment 2 RHEL Product and Program Management 2010-08-03 13:08:05 EDT
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **
Comment 3 RHEL Product and Program Management 2010-08-18 17:25:40 EDT
Thank you for your bug report. This issue was evaluated for inclusion
in the current release of Red Hat Enterprise Linux. Unfortunately, we
are unable to address this request in the current release. Because we
are in the final stage of Red Hat Enterprise Linux 6 development, only
significant, release-blocking issues involving serious regressions and
data corruption can be considered.

If you believe this issue meets the release blocking criteria as
defined and communicated to you by your Red Hat Support representative,
please ask your representative to file this issue as a blocker for the
current release. Otherwise, ask that it be evaluated for inclusion in
the next minor release of Red Hat Enterprise Linux.
Comment 4 Alex Williamson 2010-09-01 17:19:10 EDT
Created attachment 442499
panic

Same panic on the .70 kernel build.
Comment 6 Alex Williamson 2010-09-01 17:35:39 EDT
Before the most recent oops, I see:

mapcount 2 page_mapcount 3
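
The line above reports two counters that disagree. As a minimal sketch for anyone collecting these lines from several logs, the following Python helper pulls the two numbers out of such a line and states the difference. The function names are hypothetical, and the line format is assumed from the text quoted in this bug (either field order is accepted); it says nothing about why the kernel counters diverged.

```python
import re

# Assumed format, based only on the line quoted in this bug:
#   "mapcount <int> page_mapcount <int>" (either field order).
# \b keeps "mapcount" from matching inside "page_mapcount",
# because "_" counts as a word character.
MAPCOUNT_RE = re.compile(r"\bmapcount (-?\d+)")
PAGE_MAPCOUNT_RE = re.compile(r"\bpage_mapcount (-?\d+)")


def parse_mapcount_line(line):
    """Return (mapcount, page_mapcount) from a log line, or None if either is missing."""
    mapcount = MAPCOUNT_RE.search(line)
    page_mapcount = PAGE_MAPCOUNT_RE.search(line)
    if mapcount is None or page_mapcount is None:
        return None
    return int(mapcount.group(1)), int(page_mapcount.group(1))


def describe(line):
    """Summarize whether the two counters on the line agree."""
    counts = parse_mapcount_line(line)
    if counts is None:
        return "no mapcount line found"
    mapcount, page_mapcount = counts
    if mapcount == page_mapcount:
        return "counts agree"
    return (
        f"mismatch: mapcount={mapcount}, page_mapcount={page_mapcount}, "
        f"difference={page_mapcount - mapcount}"
    )


if __name__ == "__main__":
    print(describe("mapcount 2 page_mapcount 3"))
```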
Comment 7 Alex Williamson 2010-09-01 18:47:08 EDT
The VM I was running used a patched upstream qemu-kvm with the following options:

-enable-kvm -m 2000 -smp 2,sockets=2,cores=1,threads=1 -name rhel6vm -uuid 1e3f234d-338f-e438-1d43-14393856409c -nodefconfig -nodefaults -monitor stdio -rtc base=utc -boot c -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/var/lib/libvirt/images/VMs/rhel6vm.img,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -usb -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -netdev tap,script=/home/alwillia/bin/br0-ifup,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:5f:78:73,bus=pci.0,addr=0x3,txtimer=1 -vnc :1 -no-kvm-irqchip -no-kvm-pit

The host system is a single-socket Xeon W3520 (Nehalem class) with 12 GB of RAM.
Comment 8 Andrea Arcangeli 2010-09-02 12:44:49 EDT
Hello,

I built a kernel that includes the fix for bug 627591. I also added some debug code specific to this bug report, so if this happens again we'll know more about what's going on, but you'll have to include the output before the oops too (it'll be printed as one line before the "page_mapcount %d mapcount %d" line).
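
Since the lines printed just before the oops matter here, one way to make sure they are captured is to cut them out of a saved console or serial log together with the BUG line. The following Python sketch does that under stated assumptions: the helper name, the default marker string "kernel BUG at", and the number of context lines are all illustrative choices, not something the debug build defines.

```python
def context_before_bug(log_text, marker="kernel BUG at", lines_before=5):
    """Return the first line containing `marker` plus up to `lines_before` lines preceding it.

    Returns an empty list when the marker does not appear in the log.
    """
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if marker in line:
            start = max(0, index - lines_before)
            return lines[start:index + 1]
    return []


if __name__ == "__main__":
    # Both lines below are quoted from this bug; a real log would contain more.
    sample = "mapcount 2 page_mapcount 3\nkernel BUG at mm/huge_memory.c:1269!\n"
    for line in context_before_bug(sample):
        print(line)
```

Only the first occurrence of the marker is reported; a log with several oopses would need the loop extended to collect each one.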

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2729732

Please use this build until you can reproduce the problem again. I have never seen this myself, and I only recall one report, from CAI Qian in bug 622327, where one stack trace is identical to yours. I thought 622327 was bad hardware. It's also worth comparing the CPUs and systems you're both using, to be sure it's not the same CPU that leads to bad results, considering that THP stresses parts of the CPU that normally wouldn't be stressed to this extent.

It can probably be explained by the slab RCU race condition fixed by Hugh. I'm not 100% sure, though, so we need more testing with the above build. If it is that bug, you need to stress the slab a lot to hit it.

I'll mark this as a duplicate of bug 622327 because that bug also has another, different stack trace within the page_lock_anon_vma context (the very function patched by the fix for bug 627591).
Comment 9 Andrea Arcangeli 2010-09-02 12:45:17 EDT

*** This bug has been marked as a duplicate of bug 622327 ***
Comment 10 Andrea Arcangeli 2010-09-02 14:02:46 EDT
The debug code failed to build on some architectures; I've now submitted a new build, as in comment #8:

https://brewweb.devel.redhat.com/taskinfo?taskID=2729988
