Bug 671161 - xen microcode WARN on save-restore
Summary: xen microcode WARN on save-restore
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: rc
: ---
Assignee: Andrew Jones
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 710632
 
Reported: 2011-01-20 16:26 UTC by Andrew Jones
Modified: 2011-08-18 14:41 UTC (History)

Fixed In Version: kernel-2.6.32-117.el6
Doc Type: Bug Fix
Doc Text:
If the microcode module was loaded, saving and restoring a Xen guest returned a warning message and a backtrace error. With this update, backtrace errors are no longer returned, and saving and restoring a Xen guest works as expected.
Clone Of:
Environment:
Last Closed: 2011-05-23 20:38:32 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2011:0542 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 6.1 kernel security, bug fix and enhancement update 2011-05-19 11:58:07 UTC

Description Andrew Jones 2011-01-20 16:26:42 UTC
When saving and restoring a Xen guest, the guest prints the following warning to the console/dmesg on every restore if the microcode module is loaded:

------------[ cut here ]------------
WARNING: at arch/x86/kernel/microcode_core.c:451 mc_sysdev_resume+0x67/0x70 [microcode]() (Not tainted)
Modules linked in: sunrpc ipv6 xt_physdev iptable_filter ip_tables dm_mirror dm_region_hash dm_log microcode xen_netfront ext4 mbcache jbd2 xen_blkfront dm_mod [last unloaded: scsi_wait_scan]
Pid: 5, comm: migration/0 Not tainted 2.6.32-99.el6.x86_64 #1
Call Trace:
 [<ffffffff81063917>] warn_slowpath_common+0x87/0xc0
 [<ffffffff8106396a>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa009e0c7>] mc_sysdev_resume+0x67/0x70 [microcode]
 [<ffffffff8132961e>] __sysdev_resume+0x4e/0xe0
 [<ffffffff81329739>] sysdev_resume+0x89/0x190
 [<ffffffff812e4e82>] xen_suspend+0x92/0xf0
 [<ffffffff810be35b>] stop_machine_cpu_stop+0x9b/0xe0
 [<ffffffff810be2c0>] ? stop_machine_cpu_stop+0x0/0xe0
 [<ffffffff810be1ea>] cpu_stopper_thread+0xda/0x1b0
 [<ffffffff814c6166>] ? thread_return+0x4e/0x778
 [<ffffffff8100733d>] ? xen_force_evtchn_callback+0xd/0x10
 [<ffffffff81007b62>] ? check_events+0x12/0x20
 [<ffffffff81007b4f>] ? xen_restore_fl_direct_end+0x0/0x1
 [<ffffffff810be110>] ? cpu_stopper_thread+0x0/0x1b0
 [<ffffffff81089a76>] kthread+0x96/0xa0
 [<ffffffff8100c1ca>] child_rip+0xa/0x20
 [<ffffffff8100b393>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff8100bb1d>] ? retint_restore_args+0x5/0x6
 [<ffffffff8100c1c0>] ? child_rip+0x0/0x20
---[ end trace a27ac8656b6a708e ]---

The code that produces this warning (below) has been around forever. However, we haven't seen the problem before because the module was never loaded. Recent changes to the microcode_ctl package

# rpm -q --changelog microcode_ctl
* Wed Nov 24 2010 Anton Arapov <anton> - 1:1.17-4
- Update to microcode-20101123.dat
- Make microcode_ctl event driven
- Resolves: rhbz#578107
...

have started loading the module automatically on some platforms. This also occurs with upstream kernel code, so there is currently no fix available. To fix it, we either need to find a way in the kernel to satisfy the WARN_ON or, to at least fix it in RHEL, ensure that the microcode module doesn't get loaded automatically on Xen platforms.


436 static int mc_sysdev_resume(struct sys_device *dev)
437 {
438         int cpu = dev->id;
439         struct ucode_cpu_info *uci = ucode_cpu_info + cpu;
440 
441         if (!cpu_online(cpu))
442                 return 0;
443 
444         /*
445          * All non-bootup cpus are still disabled,
446          * so only CPU 0 will apply ucode here.
447          *
448          * Moreover, there can be no concurrent
449          * updates from any other places at this point.
450          */
451         WARN_ON(cpu != 0);
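
The following is a minimal user-space sketch of why the check above fires, not kernel code. It assumes, as the trace suggests, that a Xen save/restore leaves the secondary VCPUs online while the resume callbacks run, whereas the comment in the excerpt expects only CPU 0 to be online. The names resume_would_warn and count_resume_warnings are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */

/* Mirrors the two checks in the excerpt: skip offline CPUs, then
 * report whether WARN_ON(cpu != 0) would fire for this CPU. */
static bool resume_would_warn(const bool *online, int cpu)
{
    if (!online[cpu])
        return false;   /* corresponds to the early "return 0" */
    return cpu != 0;    /* corresponds to WARN_ON(cpu != 0) */
}

/* Counts how many per-CPU resume callbacks would warn in one pass. */
static int count_resume_warnings(const bool *online, size_t ncpus)
{
    int warnings = 0;

    for (size_t cpu = 0; cpu < ncpus; cpu++)
        if (resume_would_warn(online, (int)cpu))
            warnings++;
    return warnings;
}
```

With only CPU 0 marked online, the model reports no warnings. With every CPU marked online, which is the assumed Xen case, every CPU other than 0 trips the check.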

Comment 3 RHEL Program Management 2011-02-01 05:45:06 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.

Comment 4 RHEL Program Management 2011-02-01 19:08:49 UTC
This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 5 Andrew Jones 2011-02-03 16:01:15 UTC
I haven't had time to try and come up with a clean way to handle this when running on Xen for 6.1. Can we just blacklist this module when on Xen somehow?

Comment 6 Anton Arapov 2011-02-04 07:37:54 UTC
Andrew, it seems someone already tried to fix it:
  http://marc.info/?l=linux-kernel&m=126105863415715&w=2
Could you try the patch as well and check whether it fixes this case? I will push it upstream.

thanks,

Comment 7 Anton Arapov 2011-02-04 09:21:43 UTC
Andrew, here are the kernels with the patch integrated:
   http://people.redhat.com/aarapov/kernel/

Comment 8 Anton Arapov 2011-02-09 07:23:48 UTC
Patch is in mm-tree.
  http://marc.info/?l=linux-mm-commits&m=129720269928412&w=2

 - would you backport it?
 - is this bug a blocker?

Comment 9 Andrew Jones 2011-02-09 16:51:14 UTC
Yes, I need to backport it now for 6.1, since it's pretty ugly to get a big backtrace on every save/restore. I couldn't test it until now because save/restore had other issues (bug 676009). With those issues resolved, I've now tested this patch and will post it today.

Comment 10 RHEL Program Management 2011-02-09 17:11:14 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 11 Aristeu Rozanski 2011-02-18 22:31:32 UTC
Patch(es) available on kernel-2.6.32-117.el6

Comment 17 Igor Mammedov 2011-04-22 10:23:21 UTC
The RHEL 6.0 kernel 2.6.32-71.24.1 is affected by this WARNING too.
Moreover, if the microcode module is loaded and has a valid
microcode blob, a PV guest with more than one VCPU will hit
BUG_ON(raw_smp_processor_id() != cpu) in the vendor-specific
apply_microcode function on restore and crash like this:

------------[ cut here ]------------
kernel BUG at arch/x86/kernel/microcode_amd.c:142!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/platform/microcode/firmware/microcode/loading
CPU 0 
Modules linked in: microcode(U) ipv6 dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

Modules linked in: microcode(U) ipv6 dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 571, comm: kstop/0 Tainted: G        W  ----------------  2.6.32-71.24.1.el6.x86_64nxen #24 
RIP: e030:[<ffffffffa0080062>]  [<ffffffffa0080062>] apply_microcode_amd+0xb2/0xc0 [microcode]
RSP: e02b:ffff88007be03d40  EFLAGS: 00010097
RAX: 0000000000000000 RBX: ffffc90000322000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff81cf0018 RDI: 0000000000000001
RBP: ffff88007be03d70 R08: 0000000000000000 R09: ffffffff8156f780
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000004000000000 R15: ffff8800021bdec8
FS:  00007ff285a427a0(0000) GS:ffff88000219f000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000007c7b4000 CR4: 0000000000000660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Process kstop/0 (pid: 571, threadinfo ffff88007be02000, task ffff88007b008080)
Stack:
 ffffffff81810848 ffff8800021cab28 0000004000000000 0000000000000001
<0> ffffffff81810848 ffff8800021cab28 ffff88007be03d90 ffffffffa007f0ad
<0> 000000000000000b ffffffffa00817c0 ffff88007be03dc0 ffffffff8139be2e
Call Trace:
 [<ffffffffa007f0ad>] mc_sysdev_resume+0x4d/0x70 [microcode]
 [<ffffffff8139be2e>] __sysdev_resume+0x4e/0xe0
 [<ffffffff8139bf49>] sysdev_resume+0x89/0x190
 [<ffffffff810c6260>] ? stop_cpu+0x0/0xf0
 [<ffffffff81355df2>] xen_suspend+0x92/0xf0
 [<ffffffff810c6309>] stop_cpu+0xa9/0xf0
 [<ffffffff8108c780>] worker_thread+0x170/0x2a0
 [<ffffffff8100f33d>] ? xen_force_evtchn_callback+0xd/0x10
 [<ffffffff8100fb62>] ? check_events+0x12/0x20
 [<ffffffff81091e50>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108c610>] ? worker_thread+0x0/0x2a0
 [<ffffffff81091ae6>] kthread+0x96/0xa0
 [<ffffffff810141ca>] child_rip+0xa/0x20
 [<ffffffff81013393>] ? int_ret_from_sys_call+0x7/0x1b
 [<ffffffff81013b1d>] ? retint_restore_args+0x5/0x6
 [<ffffffff810141c0>] ? child_rip+0x0/0x20
Code: 8b 5d e8 4c 8b 65 f0 4c 8b 6d f8 c9 c3 0f 1f 40 00 89 da 44 89 e6 48 c7 c7 68 11 08 a0 31 c0 e8 09 ee 49 e1 b8 ff ff ff ff eb d4 <0f> 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 
RIP  [<ffffffffa0080062>] apply_microcode_amd+0xb2/0xc0 [microcode]
 RSP <ffff88007be03d40>
---[ end trace 7f34b47b4668bb2c ]---
Kernel panic - not syncing: Fatal exception
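
The crash condition described above can be sketched as a small user-space model. This is only an illustration under stated assumptions: the resume callbacks for all VCPUs are executed from one CPU (CPU 0 in the log above), and the apply path is reached only when a valid microcode blob is cached. The struct resume_ctx and the function resume_would_bug are hypothetical names, not kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical model of the crash condition; names are illustrative. */
struct resume_ctx {
    int running_cpu;   /* CPU that executes the resume callbacks */
    bool have_blob;    /* whether a valid microcode blob is cached */
};

/* True when BUG_ON(raw_smp_processor_id() != cpu) would trigger
 * while resuming the device that belongs to target_cpu. */
static bool resume_would_bug(const struct resume_ctx *ctx, int target_cpu)
{
    if (!ctx->have_blob)
        return false;  /* assumed: nothing to apply, check not reached */
    return ctx->running_cpu != target_cpu;
}
```

In this model a single-VCPU guest never crashes, because the only target CPU is the one running the callbacks. A guest with two or more VCPUs and a cached blob hits the check as soon as the device for CPU 1 is resumed from CPU 0.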

Comment 18 Andrew Jones 2011-04-22 11:27:16 UTC
Comment 17 shows that this is a good candidate for 6.0.z. Adding the zstream keyword.

Comment 21 errata-xmlrpc 2011-05-23 20:38:32 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html

Comment 23 Martin Prpič 2011-08-18 14:41:29 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
If the microcode module was loaded, saving and restoring a Xen guest
returned a warning message and a backtrace error. With this update,
backtrace errors are no longer returned, and saving and restoring a Xen
guest works as expected.

