620874 – kernel BUG at mm/huge_memory.c:1269!

Summary: kernel BUG at mm/huge_memory.c:1269!

Keywords:
Status:	CLOSED DUPLICATE of bug 622327
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	6.0
Hardware:	All
OS:	Linux
Priority:	low
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Red Hat Kernel Manager
QA Contact:	Red Hat Kernel QE team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2010-08-03 16:46 UTC by Alex Williamson
Modified:	2010-09-02 18:02 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-09-02 16:45:17 UTC
Target Upstream Version:
Embargoed:

Attachments
panic log (4.54 KB, text/plain) 2010-08-03 16:46 UTC, Alex Williamson
panic (5.53 KB, text/plain) 2010-09-01 21:19 UTC, Alex Williamson

Description Alex Williamson 2010-08-03 16:46:31 UTC

Created attachment 436326 [details]
panic log

Description of problem:
kernel panic, see attachment

Version-Release number of selected component (if applicable):
kernel-2.6.32-54.el6.x86_64

How reproducible:
unknown

Steps to Reproduce:
1. Unknown; the panic happened while shutting down a KVM guest.
  
Actual results:
host panic

Expected results:
no panic

Additional info:

Comment 2 RHEL Program Management 2010-08-03 17:08:05 UTC

This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 3 RHEL Program Management 2010-08-18 21:25:40 UTC

Thank you for your bug report. This issue was evaluated for inclusion
in the current release of Red Hat Enterprise Linux. Unfortunately, we
are unable to address this request in the current release. Because we
are in the final stage of Red Hat Enterprise Linux 6 development, only
significant, release-blocking issues involving serious regressions and
data corruption can be considered.

If you believe this issue meets the release blocking criteria as
defined and communicated to you by your Red Hat Support representative,
please ask your representative to file this issue as a blocker for the
current release. Otherwise, ask that it be evaluated for inclusion in
the next minor release of Red Hat Enterprise Linux.

Comment 4 Alex Williamson 2010-09-01 21:19:10 UTC

Created attachment 442499 [details]
panic

Same panic on .70

Comment 6 Alex Williamson 2010-09-01 21:35:39 UTC

Before the most recent oops, I see:

mapcount 2 page_mapcount 3

Comment 7 Alex Williamson 2010-09-01 22:47:08 UTC

The VM I was running was using a patched upstream qemu-kvm with the following options:

-enable-kvm -m 2000 -smp 2,sockets=2,cores=1,threads=1 -name rhel6vm -uuid 1e3f234d-338f-e438-1d43-14393856409c -nodefconfig -nodefaults -monitor stdio -rtc base=utc -boot c -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/var/lib/libvirt/images/VMs/rhel6vm.img,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -usb -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -netdev tap,script=/home/alwillia/bin/br0-ifup,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:5f:78:73,bus=pci.0,addr=0x3,txtimer=1 -vnc :1 -no-kvm-irqchip -no-kvm-pit

The host system is a single-socket Xeon W3520 (Nehalem class) with 12G of memory.

Comment 8 Andrea Arcangeli 2010-09-02 16:44:49 UTC

Hello,

I built a kernel that includes the fix for bug 627591. I also added some debug code specific to this bug report, so if this happens again we'll know more about what's going on, but you'll have to include the output before the oops too (it'll be printed as 1 line before the page_mapcount %d mapcount %d line).

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2729732

Please use this build until you can reproduce the problem again. I've absolutely never seen this myself, and I only recall one report, from CAI Qian in bug 622327, where one stack trace is identical to yours. I thought 622327 was bad hardware. It's also worth comparing the CPUs and systems you're both using, to be sure it's not the same CPU that leads to bad results, considering THP is stressing bits of the CPU that normally wouldn't be stressed to this extent.

It probably can be explained as the slab RCU race condition fixed by Hugh. I'm not 100% sure though, so we need more testing with the above build to be sure. If it's that bug you need to stress the slab a lot.

I'll mark this as a duplicate of 622327 because that bug also has another, different stack trace within the page_lock_anon_vma context (the very function patched by the fix for bug 627591).
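
For gathering the output requested above (the debug line printed just before the mapcount line), here is a minimal sketch of a log filter. It assumes the message looks like the "mapcount 2 page_mapcount 3" line quoted in comment 6, or the reversed field order mentioned in this comment; the function name and the exact message format are illustrative assumptions, not something taken from the kernel source.

```python
import re

# Matches either field order seen in this report:
#   "mapcount 2 page_mapcount 3"  or  "page_mapcount 3 mapcount 2"
# The exact kernel message format is an assumption based on the comments here.
PAIR = re.compile(
    r"\b(page_mapcount|mapcount) (-?\d+) (page_mapcount|mapcount) (-?\d+)"
)


def find_mapcount_reports(log_text):
    """Return (previous_line, mapcount, page_mapcount) for each matching line."""
    lines = log_text.splitlines()
    reports = []
    for i, line in enumerate(lines):
        m = PAIR.search(line)
        # Skip lines without the pair, or with the same field name twice.
        if not m or m.group(1) == m.group(3):
            continue
        values = {m.group(1): int(m.group(2)), m.group(3): int(m.group(4))}
        # The extra debug output is expected on the line just before this one.
        previous = lines[i - 1] if i > 0 else ""
        reports.append((previous, values["mapcount"], values["page_mapcount"]))
    return reports
```

It could be run over a saved console log, for example `find_mapcount_reports(open("panic.log").read())`, where `panic.log` is a placeholder file name. Each returned tuple carries the preceding line, so the extra debug output can be pasted into the report together with the two counts.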

Comment 9 Andrea Arcangeli 2010-09-02 16:45:17 UTC


*** This bug has been marked as a duplicate of bug 622327 ***

Comment 10 Andrea Arcangeli 2010-09-02 18:02:46 UTC

The debug code failed to build on some architectures, so I've submitted a new build, set up as in comment #8:

https://brewweb.devel.redhat.com/taskinfo?taskID=2729988
