Bug 646384 - kernel BUG at mm/migrate.c:113!
Summary: kernel BUG at mm/migrate.c:113!
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Andrea Arcangeli
QA Contact: Caspar Zhang
URL:
Whiteboard:
Depends On: Rhel6KvmTier1
Blocks: 647391
Reported: 2010-10-25 09:42 UTC by Qian Cai
Modified: 2013-07-03 07:27 UTC
CC List: 8 users

Fixed In Version: kernel-2.6.32-81.el6
Doc Type: Bug Fix
Doc Text:
Running certain workload tests on a Non-Uniform Memory Architecture (NUMA) system could cause a kernel panic at mm/migrate.c:113. This was due to a false positive BUG_ON. With this update, the false positive BUG_ON has been removed.
Clone Of:
Environment:
Last Closed: 2011-05-19 12:01:44 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2011:0542
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 6.1 kernel security, bug fix and enhancement update
Last Updated: 2011-05-19 11:58:07 UTC

Description Qian Cai 2010-10-25 09:42:48 UTC
Description of problem:
kernel BUG at mm/migrate.c:113!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu63/cache/index2/shared_cpu_map
CPU 0 
Modules linked in: tun ip6table_filter ip6_tables ebtable_nat ebtables xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT iptable_filter ip_tables bridge stp llc kvm_intel kvm autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 dm_mirror dm_region_hash dm_log i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: microcode]

Modules linked in: tun ip6table_filter ip6_tables ebtable_nat ebtables xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT iptable_filter ip_tables bridge stp llc kvm_intel kvm autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table ipv6 dm_mirror dm_region_hash dm_log i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: microcode]
Pid: 28103, comm: largepages15 Tainted: G        W  ----------------  2.6.32-76.el6.test.x86_64 #1 QSSC-S4R
RIP: 0010:[<ffffffff8115b4ea>]  [<ffffffff8115b4ea>] remove_migration_pte+0x20a/0x2f0
RSP: 0000:ffff88105d7c99a8  EFLAGS: 00010246
RAX: 8000000937e000e5 RBX: ffff880c6cc2cdc0 RCX: ffffea000732f1e0
RDX: ffff880bf61d9000 RSI: ffff8809021d2d40 RDI: 0000000000000000
RBP: ffff88105d7c9a08 R08: 00003ffffffff000 R09: ffff880000000000
R10: ffffc00000000fff R11: ffff880894b0edf0 R12: 00007ffff7ce4000
R13: ffff8809021d2d40 R14: ffffea000f717138 R15: ffffffff8115b2e0
FS:  00007ffff7ff1700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007ffff7df1000 CR3: 000000086bc54000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process largepages15 (pid: 28103, threadinfo ffff88105d7c8000, task ffff88105b2c2080)
Stack:
 0000000000000000 00003ffffffff000 ffff88046c6891b8 ffffc00000000fff
<0> ffffea000000001e ffffea000f5075c0 800000020e8e4045 ffff880c6c8c52b8
<0> ffffea000f717138 ffffea000732f1e0 ffff880c6c8cd4d8 ffffffff8115b2e0
Call Trace:
 [<ffffffff8115b2e0>] ? remove_migration_pte+0x0/0x2f0
 [<ffffffff8113e8fe>] rmap_walk+0x16e/0x1c0
 [<ffffffff8115b892>] ? migrate_page_copy+0x102/0x1c0
 [<ffffffff8115c08d>] migrate_pages+0x48d/0x5d0
 [<ffffffff81152750>] ? compaction_alloc+0x0/0x370
 [<ffffffff811521ac>] compact_zone+0x4ec/0x630
 [<ffffffff81152591>] compact_zone_order+0xa1/0xe0
 [<ffffffff811526db>] try_to_compact_pages+0x10b/0x180
 [<ffffffff8111e6cc>] __alloc_pages_nodemask+0x55c/0x810
 [<ffffffff811505f4>] alloc_pages_vma+0x84/0x110
 [<ffffffff8113f1c0>] ? anon_vma_prepare+0x30/0x160
 [<ffffffff81167995>] do_huge_pmd_anonymous_page+0x135/0x340
 [<ffffffff811365b5>] handle_mm_fault+0x245/0x2b0
 [<ffffffff814cd8d3>] do_page_fault+0x123/0x3a0
 [<ffffffff814cb345>] page_fault+0x25/0x30
Code: 48 09 c6 48 89 f2 48 c1 ea 3b 83 fa 1e 74 24 83 fa 1f 74 1f 48 8b 45 c8 66 ff 00 66 66 90 e9 06 ff ff ff 0f 0b eb fe 0f 0b eb fe <0f> 0b 0f 1f 40 00 eb fa 48 b8 ff ff ff ff ff ff ff 07 48 21 c6 
RIP  [<ffffffff8115b4ea>] remove_migration_pte+0x20a/0x2f0
 RSP <ffff88105d7c99a8>
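
Reading the oops above: frames prefixed with "?" are addresses found on the stack that the unwinder could not confirm as part of the call chain, and the byte marked with <...> on the "Code:" line is the instruction at RIP. Here that is 0f 0b, the x86 ud2 instruction, which is what BUG() emits and why the report shows "invalid opcode". The following is a minimal Python sketch, not part of the original report, that extracts both pieces of information. The helper names reliable_frames and faulting_opcode are made up for illustration.

```python
import re

# Call trace and Code line copied from the oops above.
OOPS = """\
Call Trace:
 [<ffffffff8115b2e0>] ? remove_migration_pte+0x0/0x2f0
 [<ffffffff8113e8fe>] rmap_walk+0x16e/0x1c0
 [<ffffffff8115b892>] ? migrate_page_copy+0x102/0x1c0
 [<ffffffff8115c08d>] migrate_pages+0x48d/0x5d0
 [<ffffffff81152750>] ? compaction_alloc+0x0/0x370
 [<ffffffff811521ac>] compact_zone+0x4ec/0x630
 [<ffffffff81152591>] compact_zone_order+0xa1/0xe0
 [<ffffffff811526db>] try_to_compact_pages+0x10b/0x180
 [<ffffffff8111e6cc>] __alloc_pages_nodemask+0x55c/0x810
 [<ffffffff811505f4>] alloc_pages_vma+0x84/0x110
 [<ffffffff8113f1c0>] ? anon_vma_prepare+0x30/0x160
 [<ffffffff81167995>] do_huge_pmd_anonymous_page+0x135/0x340
 [<ffffffff811365b5>] handle_mm_fault+0x245/0x2b0
 [<ffffffff814cd8d3>] do_page_fault+0x123/0x3a0
 [<ffffffff814cb345>] page_fault+0x25/0x30
Code: 48 09 c6 48 89 f2 48 c1 ea 3b 83 fa 1e 74 24 83 fa 1f 74 1f 48 8b 45 c8 66 ff 00 66 66 90 e9 06 ff ff ff 0f 0b eb fe 0f 0b eb fe <0f> 0b 0f 1f 40 00 eb fa 48 b8 ff ff ff ff ff ff ff 07 48 21 c6
"""

# One stack frame: "[<address>] [? ]symbol+0xoffset/0xsize"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reliable_frames(text):
    """Return symbol names of frames that are not marked '?' (innermost first)."""
    return [m.group(2) for m in FRAME_RE.finditer(text) if m.group(1) is None]


def faulting_opcode(text):
    """Return the byte marked <..> on the Code: line plus the byte after it."""
    code_line = next(l for l in text.splitlines() if l.startswith("Code:"))
    tokens = code_line.split()[1:]
    idx = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [tokens[idx].strip("<>"), tokens[idx + 1]]


if __name__ == "__main__":
    # Print the confirmed call chain from the outermost caller inward.
    print(" -> ".join(reversed(reliable_frames(OOPS))))
    print("bytes at RIP:", " ".join(faulting_opcode(OOPS)))
```

Read outermost first, the confirmed frames run from page_fault through do_huge_pmd_anonymous_page and try_to_compact_pages down to migrate_pages and rmap_walk: a transparent huge page fault triggered memory compaction, which was migrating pages when the BUG_ON in remove_migration_pte fired.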

Version-Release number of selected component (if applicable):
kernel from RHBZ#622327#c81.

How reproducible:
unknown

Steps to Reproduce:
1. Prepare a NUMA system (reproduced on a Nehalem-EX system).
2. Run the threade_memtest+oom+kernelbuild+kvm workloads.
3. Take the reproducer from RHBZ#642570, modify largepages15.c to use KSM, and run 60 instances in the background:
# for i in `seq 1 60`; do ./largepages15 & done
  
Actual results:
panic

Expected results:
No panic.

Additional info:
Unfortunately, kdump did not work in this case, so no vmcore was captured.

Comment 3 Andrea Arcangeli 2010-10-25 17:39:23 UTC
Fix posted to rhkernel-list with Message-ID: <20101025173439.GM910>

I removed the false positive BUG_ON and introduced one new VM_BUG_ON in a place related to a s/!pmd_present/pmd_none/ change. The new VM_BUG_ON will be converted to BUG_ON in the build that I will provide to QA, so that it gets exercised.

The build system I use has a disk-full problem; as soon as it's fixed I'll provide a build with the patch included. Thanks!
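
On why the new VM_BUG_ON is converted to BUG_ON for the QA build: in the upstream kernel, VM_BUG_ON(cond) expands to BUG_ON(cond) only when CONFIG_DEBUG_VM is enabled and is otherwise compiled out, so a plain BUG_ON guarantees the check runs whatever the configuration. Below is a toy Python model of that behaviour. It is not kernel code; the names KernelBug, make_vm_bug_on and fires are illustrative, and this bug does not state whether the QA build has CONFIG_DEBUG_VM set.

```python
class KernelBug(Exception):
    """Stands in for the kernel stopping at a BUG() site."""


def BUG_ON(condition: bool) -> None:
    # Always present in the build: a true condition halts the kernel.
    if condition:
        raise KernelBug("kernel BUG")


def make_vm_bug_on(config_debug_vm: bool):
    """Return a VM_BUG_ON that behaves as the macro would for this config."""

    def VM_BUG_ON(condition: bool) -> None:
        # Active only with CONFIG_DEBUG_VM; otherwise a no-op.
        # (In C the condition is not even evaluated when compiled out;
        # this model cannot reproduce that detail.)
        if config_debug_vm:
            BUG_ON(condition)

    return VM_BUG_ON


def fires(check, condition: bool) -> bool:
    """Report whether the given check would stop the kernel."""
    try:
        check(condition)
    except KernelBug:
        return True
    return False


if __name__ == "__main__":
    print("BUG_ON, bad state:               ", fires(BUG_ON, True))
    print("VM_BUG_ON without DEBUG_VM:      ", fires(make_vm_bug_on(False), True))
    print("VM_BUG_ON with DEBUG_VM:         ", fires(make_vm_bug_on(True), True))
```

With the check expressed as BUG_ON, a QA run that hits the unexpected state panics visibly instead of passing silently on a kernel built without CONFIG_DEBUG_VM.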

Comment 4 Andrea Arcangeli 2010-10-25 18:11:58 UTC
A build with the fix from comment #3 included (with VM_BUG_ON converted to BUG_ON) is available here:

http://brewweb.devel.redhat.com/brew/taskinfo?taskID=2850419

Comment 5 RHEL Program Management 2010-10-26 10:49:26 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 12 Aristeu Rozanski 2010-11-12 19:14:19 UTC
Patch(es) available on kernel-2.6.32-82.el6
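
A rough way to check whether a given kernel release string predates the build named in this comment is sketched below in Python. This is a simplified comparison that looks only at the numeric version and the first number of the release field; it is not RPM's full version-comparison algorithm, and the function names release_key and includes_fix are hypothetical.

```python
import re


def release_key(release: str) -> tuple:
    """Return (major, minor, patch, build) from strings like '2.6.32-82.el6'."""
    m = re.match(r"(?:kernel-)?(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in m.groups())


def includes_fix(running: str, fixed_in: str = "kernel-2.6.32-82.el6") -> bool:
    """True if `running` is at or beyond the build named in `fixed_in`."""
    return release_key(running) >= release_key(fixed_in)


if __name__ == "__main__":
    # The kernel from the oops in the description (build 76) predates build 82.
    print(includes_fix("2.6.32-76.el6.test.x86_64"))
```

On a live system the release string could be taken from `uname -r`. Test or scratch builds such as the ".test" kernel in the description may carry patches that the build number alone does not reflect.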

Comment 18 Martin Prpič 2011-05-09 12:21:56 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Running certain workload tests on a Non-Uniform Memory Architecture (NUMA) system could cause a kernel panic at mm/migrate.c:113. This was due to a false positive BUG_ON. With this update, the false positive BUG_ON has been removed.

Comment 19 errata-xmlrpc 2011-05-19 12:01:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html

