Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 724995

Summary:

xen mmu: fix a race window causing leave_mm BUG()

Product:

Red Hat Enterprise Linux 6

Reporter:

Radim Krčmář <rkrcmar>

Component:

kernel

Assignee:

Radim Krčmář <rkrcmar>

Status:

CLOSED ERRATA

QA Contact:

Virtualization Bugs <virt-bugs>

Severity:

medium

Docs Contact:

Priority:

unspecified

Version:

6.2

CC:

leiwang, qguan, qwan

Target Milestone:

Target Release:

---

Hardware:

Unspecified

OS:

Linux

Whiteboard:

Fixed In Version:

kernel-2.6.32-176.el6

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2011-12-06 13:53:40 UTC

Type:

---

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Attachments:

Description	Flags
xen mmu: fix a race window causing leave_mm BUG()	none

Description Radim Krčmář 2011-07-22 14:30:49 UTC

Created attachment 514712 [details]
xen mmu: fix a race window causing leave_mm BUG()

Cherry-pick 7899891c7d161752f29abcc9bc0a9c6c3a3af26c from upstream.
Original commit message:

    xen mmu: fix a race window causing leave_mm BUG()
    
    There's a race window in xen_drop_mm_ref, where remote cpu may exit
    dirty bitmap between the check on this cpu and the point where remote
    cpu handles drop request. So in drop_other_mm_ref we need check
    whether TLB state is still lazy before calling into leave_mm. This
    bug is rarely observed in earlier kernel, but exaggerated by the
    commit 831d52bc153971b70e64eccfbed2b232394f22f8
    ("x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm")
    which clears bitmap after changing the TLB state. the call trace is as below:
    
    ---------------------------------
    kernel BUG at arch/x86/mm/tlb.c:61!
    invalid opcode: 0000 [#1] SMP
    last sysfs file: /sys/devices/system/xen_memory/xen_memory0/info/current_kb
    CPU 1
    Modules linked in: 8021q garp xen_netback xen_blkback blktap blkback_pagemap nbd bridge stp llc autofs4 ipmi_devintf ipmi_si ipmi_msghandler lockd sunrpc bonding ipv6 xenfs dm_multipath video output sbs sb
    Pid: 25581, comm: khelper Not tainted 2.6.32.36fixxen #1 Tecal RH2285
    RIP: e030:[<ffffffff8103a3cb>]  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
    RSP: e02b:ffff88002805be48  EFLAGS: 00010046
    RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88015f8e2da0
    RDX: ffff88002805be78 RSI: 0000000000000000 RDI: 0000000000000001
    RBP: ffff88002805be48 R08: ffff88009d662000 R09: dead000000200200
    R10: dead000000100100 R11: ffffffff814472b2 R12: ffff88009bfc1880
    R13: ffff880028063020 R14: 00000000000004f6 R15: 0000000000000000
    FS:  00007f62362d66e0(0000) GS:ffff880028058000(0000) knlGS:0000000000000000
    CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000003aabc11909 CR3: 000000009b8ca000 CR4: 0000000000002660
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000000 00
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process khelper (pid: 25581, threadinfo ffff88007691e000, task ffff88009b92db40)
    Stack:
     ffff88002805be68 ffffffff8100e4ae 0000000000000001 ffff88009d733b88
    <0> ffff88002805be98 ffffffff81087224 ffff88002805be78 ffff88002805be78
    <0> ffff88015f808360 00000000000004f6 ffff88002805bea8 ffffffff81010108
    Call Trace:
     <IRQ>
     [<ffffffff8100e4ae>] drop_other_mm_ref+0x2a/0x53
     [<ffffffff81087224>] generic_smp_call_function_single_interrupt+0xd8/0xfc
     [<ffffffff81010108>] xen_call_function_single_interrupt+0x13/0x28
     [<ffffffff810a936a>] handle_IRQ_event+0x66/0x120
     [<ffffffff810aac5b>] handle_percpu_irq+0x41/0x6e
     [<ffffffff8128c1c0>] __xen_evtchn_do_upcall+0x1ab/0x27d
     [<ffffffff8128dd11>] xen_evtchn_do_upcall+0x33/0x46
     [<ffffffff81013efe>] xen_do_hyper visor_callback+0x1e/0x30
     <EOI>
     [<ffffffff814472b2>] ? _spin_unlock_irqrestore+0x15/0x17
     [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
     [<ffffffff81113f71>] ? flush_old_exec+0x3ac/0x500
     [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
     [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
     [<ffffffff8115115d>] ? load_elf_binary+0x398/0x17ef
     [<ffffffff81042fcf>] ? need_resched+0x23/0x2d
     [<ffffffff811f4648>] ? process_measurement+0xc0/0xd7
     [<ffffffff81150dc5>] ? load_elf_binary+0x0/0x17ef
     [<ffffffff81113094>] ? search_binary_handler+0xc8/0x255
     [<ffffffff81114362>] ? do_execve+0x1c3/0x29e
     [<ffffffff8101155d>] ? sys_execve+0x43/0x5d
     [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
     [<ffffffff81013e28>] ? kernel_execve+0x68/0xd0
     [<ffffffff 8106fc45>] ? __call_usermodehelper+0x0/0x6f
     [<ffffffff8100f8cf>] ? xen_restore_fl_direct_end+0x0/0x1
     [<ffffffff8106fb64>] ? ____call_usermodehelper+0x113/0x11e
     [<ffffffff81013daa>] ? child_rip+0xa/0x20
     [<ffffffff8106fc45>] ? __call_usermodehelper+0x0/0x6f
     [<ffffffff81012f91>] ? int_ret_from_sys_call+0x7/0x1b
     [<ffffffff8101371d>] ? retint_restore_args+0x5/0x6
     [<ffffffff81013da0>] ? child_rip+0x0/0x20
    Code: 41 5e 41 5f c9 c3 55 48 89 e5 0f 1f 44 00 00 e8 17 ff ff ff c9 c3 55 48 89 e5 0f 1f 44 00 00 65 8b 04 25 c8 55 01 00 ff c8 75 04 <0f> 0b eb fe 65 48 8b 34 25 c0 55 01 00 48 81 c6 b8 02 00 00 e8
    RIP  [<ffffffff8103a3cb>] leave_mm+0x15/0x46
     RSP <ffff88002805be48>
    ---[ end trace ce9cee6832a9c503 ]---

    Tested-by: Maoxiaoyun<tinnycloud>
    Signed-off-by: Kevin Tian <kevin.tian>
    [v1: Fleshed out the git description a bit]
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk>

Comment 2 Qixiang Wan 2011-07-28 11:53:35 UTC

Hi Radim, do you have any instructions for QE to reproduce and verify this bug? or only SanityCheck is enough?

Comment 4 Aristeu Rozanski 2011-08-02 13:58:11 UTC

Patch(es) available on kernel-2.6.32-176.el6

Comment 9 errata-xmlrpc 2011-12-06 13:53:40 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html