Bug 707966
Summary: | 2.6.18-238.1.1.el5 or newer won't boot under Xen HVM due to linux-2.6-virt-nmi-don-t-print-nmi-stuck-messages-on-guests.patch | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Jan Kundrát <jkt> |
Component: | kernel-xen | Assignee: | Laszlo Ersek <lersek> |
Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | high | Docs Contact: | |
Priority: | urgent | ||
Version: | 5.5 | CC: | ajb, cww, dhoward, drjones, dzickus, honza801, jzheng, leiwang, mrezanin, pbonzini, pcao, qguan, qwan, sforsber, xen-maint |
Target Milestone: | rc | Keywords: | Regression, ZStream |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | kernel-2.6.18-285.el5 | Doc Type: | Bug Fix |
Doc Text: |
A previously applied patch to help clean up a failed nmi_watchdog check by disabling various registers caused single-vcpu Xen HVM guests to become unresponsive during boot when the host CPU was an Intel Xeon Processor E5405 or an Intel Xeon Processor E5420, and the VM configuration did not have the apic = 1 parameter set. With this update, NMI_NONE is the default watchdog on AMD64 HVM guests, which fixes this issue.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2012-02-21 03:48:55 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 514489, 739823 | ||
Attachments: |
Description
Jan Kundrát
2011-05-26 12:48:39 UTC
Hi Jan,

assuming there wasn't some strange issue that led the bisection to this patch incorrectly, then the only thing it could be is the apic_write. I wonder why this hasn't shown up in our testing or from other deployments yet, though. One quick thing to test is to boot with nmi_watchdog=2 on the guest's kernel command line. That'll set the nmi_watchdog to NMI_LOCAL_APIC and call lapic_watchdog_stop instead (which I believe is a no-op; if not, you can try adding nolapic to the guest kernel command line too, to make sure it is). If booting with this option (or these options) makes all your guests happy over several reboot cycles, then we can be pretty confident that the issue is the apic_write, and we'll start looking at writing a patch.

thanks,
Drew

Hi Drew,

I tried playing with 2.6.18-245.el5. Without that extra parameter, that kernel gets stuck during boot. Adding just the "nmi_watchdog=2" to the guest's command line allows it to boot, and it behaves consistently over several `xm destroy`/`xm create` cycles, so I guess you've indeed found the culprit.

I'm not sure how relevant this is, but the physical machine on which this happens is an HP Proliant DL360 G5; dmidecode reports its BIOS version as "P58" from "08/03/2008". The CPUs are Intel Xeon E5420. However, I'm pretty sure I tried that even on some Dell or Supermicro machines (could check if it helps).

Cheers,
Jan

Hi Jan,

thanks for the additional testing. I actually believe this is a regression introduced by a hypervisor patch, rather than this kernel patch. I believe this is the culprit

  [xen] emulate injection of guest NMI

which first appeared in the -222 build. Since the kernel patch you've pointed to first appeared in the -216 build, we could actually try a -216 build or anything < -222 and see if the problem is gone. Then see if the problem appears with -222, for our final confirmation.
If this is the problem patch, then it looks like it may be possible to work around it by allocating at least two vcpus for the guest, i.e. change the config to be vcpus=2. Also, for the next round of tests please add 'loglvl=all guest_loglvl=all' to your hypervisor command line (xen.gz in grub) and reboot it. Then, after the failure, grab the 'xm dmesg' output.

thanks,
Drew

(In reply to comment #3)
> [xen] emulate injection of guest NMI
>
> which first appeared in the -222 build.

Please note that at the time I first hit this issue, the dom0 was running kernel 2.6.18-194.32.1.el5xen, i.e. a kernel which is not supposed to be affected. At that time, I also tried going between the -194.32.1.el5xen and -238.1.1.el5xen versions of the dom0 kernels, but did not see any difference -- no matter what version was running in the dom0, as soon as I switched to -238.1.1 inside the domU, the domU wouldn't boot anymore.

> If this is the problem patch, then it looks like it may be possible to work
> around it by allocating at least two vcpus for the guest, i.e. change the
> config to be vcpus=2.

I can confirm that when I try the -245.el5 kernel inside the domU (without any change in the dom0 at all, i.e. still at -238.1.1.el5xen inside the dom0), change the vcpu count to 2 in the Xen config for that particular domU and remove the nmi_watchdog bit from its kernel command line, the guest boots properly.

> Also, for the next round of tests please add 'loglvl=all guest_loglvl=all' to
> your hypervisor command line (xen.gz in grub) and reboot it. Then, after the
> failure grab 'xm dmesg' output.

Do you still want me to reboot with those settings, or are the comments above enough?

Having issues with the -194 dom0 is a strange data point, but I don't believe the apic_write(APIC_LVT0, APIC_DM_NMI... should have done anything at all without the patch I pointed to. So I'm inclined to believe the -194 dom0 problem was a different issue.
The vcpu test also seems to confirm it's the hypervisor patch causing the issues (triggered by the kernel patch you pointed to). The extra logging could help us further confirm this (and fix it) as well, so please grab the logs if you can.

OK. Where can I find the RPMs (or SRPMs) of the dom0 kernel and Xen combination I should try? Any specific domU options this time?

Hi Jan,

I've uploaded some rpms here

  http://people.redhat.com/drjones/707966/

I'd appreciate the following tests

- Install the -221 kernel/xen on your host and then add 'loglvl=all guest_loglvl=all' to the xen.gz command line.
- Make sure your guest config only has one vcpu assigned to it
- After booting up the host on the new kernel, then attempt to boot an HVM guest that has a >= -216 kernel (you can use the -264 that I put in the directory in order to use the very latest).

I believe this will work.

- Then install the -222 kernel/xen on the host and try booting the guest again.

I believe this will fail. Please capture the logs from 'xm dmesg' after it fails (the loglvl=all guest_loglvl=all should still be there).

- Then to be 100% thorough you can try installing the -264 kernel/xen on your host and repeat the boot test.

I believe this will also fail, and the 'xm dmesg' logs should be similar. I'd like these logs as well though.

Thanks for all the testing!!
Drew

It seems to me that the apic_write call is wrong. The APIC setup of Xen HVM guests will deliver external interrupts directly to the LAPIC, not to the IOAPIC. For this reason, the hvmloader will do apic_write(APIC_LVT0, APIC_DM_EXTINT) before starting Linux. Linux detects this, and does not mask the external interrupts. This is shown in dmesg during boot (loglevel=9 acpi=debug) with something like

  enabled ExtINT on CPU#0
  ENABLING IO-APIC IRQs
  init IO_APIC IRQs
  IOAPIC (apicid-pin) 1-0, 1-16, [etc.]
  ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1

Testing the NMI watchdog starts by writing APIC_DM_NMI to the LVT0 register.
In the old code, the register kept this value forever. Here is where something is missing in my theory, because I do not get how the old code can work. However, now the NMI watchdog code writes

  APIC_DM_NMI | APIC_LVT_MASKED

and this is probably the source of the problem: the new value is effectively masking external interrupts including, guess what, the serial port and IDE controller interrupts.

Note: with 5.7, it will work again by chance, because patch b41ec96 ([xen] x86/hvm: Enable delivering 8259 interrupts to VCPUs != 0, 2011-03-02) will default to delivering interrupts to VCPU#0. Before that patch, the interrupts simply will not be delivered.

In any case, the correct fix is one of the following:

1) not muck with LVT0 at all on the grounds that it has always worked (!). Possibly, complete the above theory to understand _why_ it has always worked.

2) save the old value of LVT0 in enable_NMI_through_LVT0 and add a disable_NMI_through_LVT0 that restores it.

By the way, note that depending on whether the apic2/pin2 is -1 or not in the above message, the routines that disconnect the APIC to prepare for reboot work very differently:

- if it is -1, disconnect_bsp_APIC(0) is called. It will basically do apic_write(APIC_LVT0, APIC_DM_EXTINT)

- if it is not -1, disable_IO_APIC will do the IOAPIC equivalent of apic_write(APIC_LVT0, APIC_DM_EXTINT). Then, disconnect_bsp_APIC(1) is called which will do apic_write(APIC_LVT0, APIC_LVT_MASKED).

Now, the APIC_LVT_MASKED bit is set, so the delivery mode is not interesting. As such, the apic_write in the patch is equivalent to

  /* Disable LVT0 */
  apic_write(APIC_LVT0, APIC_LVT_MASKED);

which is the "wrong choice" when running under Xen.

Also by the way, none of this IOAPIC business is done by the kernel when running as dom0---the hypervisor does it instead---which is why running the guest as a "nested" dom0 works.

The text between "By the way" and "Also by the way" above doesn't make much sense and should not have been there.
:)

(In reply to comment #7)
> - Install the -221 kernel/xen on your host and then add 'loglvl=all
> guest_loglvl=all' to the xen.gz command line.
> - Make sure you guest config only has one vcpu assigned to it
> - After booting up the host on the new kernel, then attempt to boot an HVM
> guest that has a >= -216 kernel (you can use the -264 that I put in the
> directly in order to use the very latest).
>
> I believe this will work.

Hi Andrew,

sorry for the delay. I've followed these instructions (i.e. 2.6.18-221.el5xen in the dom0, the -264 in domU, boot arguments to Xen in dom0), but the guest is getting stuck immediately after the "Serial..." line. I'll attach the xm dmesg log shortly (bugzie won't allow me now). I haven't done anything else.

Created attachment 502326 [details]
xm-dmesg-2.6.18-221.
Please note that I started the domU once, saw that it gets stuck, `xm destroy`ed it, started again, observed the same behavior and only then grabbed the `xm dmesg` log.
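The LVT0 masking theory raised earlier in this thread can be illustrated with a small model. This is only a sketch, not kernel or Xen code: the three APIC_* values are the ones from the Linux apicdef.h header, while `FakeLapic`, `DELIVERY_MODE_MASK` and the two `watchdog_check_*` functions are hypothetical names made up for the illustration. It assumes, as described above, that hvmloader leaves LVT0 in ExtINT mode and that the cleanup path writes APIC_DM_NMI | APIC_LVT_MASKED.

```python
# LVT bit values as defined in the Linux apicdef.h header.
APIC_DM_NMI = 0x00400        # delivery mode: NMI
APIC_DM_EXTINT = 0x00700     # delivery mode: ExtINT (8259 PIC pass-through)
APIC_LVT_MASKED = 1 << 16    # mask bit: when set, the entry delivers nothing

# Name local to this sketch: the delivery-mode field occupies bits 8-10.
DELIVERY_MODE_MASK = 0x00700


class FakeLapic:
    """Hypothetical stand-in for the LVT0 register; not a real API."""

    def __init__(self, lvt0):
        self.lvt0 = lvt0

    def delivers_extint(self):
        unmasked = not (self.lvt0 & APIC_LVT_MASKED)
        return unmasked and (self.lvt0 & DELIVERY_MODE_MASK) == APIC_DM_EXTINT


def watchdog_check_masking(lapic):
    """Model of the behavior suspected in the thread: LVT0 is left masked."""
    lapic.lvt0 = APIC_DM_NMI                     # start of the watchdog test
    lapic.lvt0 = APIC_DM_NMI | APIC_LVT_MASKED   # cleanup after a failed check


def watchdog_check_restoring(lapic):
    """Model of fix option 2: save LVT0 first, restore it afterwards."""
    saved = lapic.lvt0
    lapic.lvt0 = APIC_DM_NMI
    lapic.lvt0 = saved


# hvmloader is described as leaving LVT0 in ExtINT mode before starting Linux.
masked = FakeLapic(APIC_DM_EXTINT)
watchdog_check_masking(masked)

restored = FakeLapic(APIC_DM_EXTINT)
watchdog_check_restoring(restored)

print(masked.delivers_extint(), restored.delivers_extint())
```

In this model the masking variant ends with external interrupts blocked, which would match a guest that stops right after the serial driver message, while the save-and-restore variant leaves ExtINT delivery intact.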
yeah, these are the interesting logs

  (XEN) vlapic.c:687:d2 Local APIC Write to read-only register 0x30
  (XEN) vlapic.c:687:d2 Local APIC Write to read-only register 0x20
  (XEN) vlapic.c:687:d2 Local APIC Write to read-only register 0x20
  (XEN) vlapic.c:687:d2 Local APIC Write to read-only register 0x20

It looks like I stand corrected. The kernel patch alone appears to start the problems. We see that now that we're running without the HV NMI related patch, and still having problems.

I uploaded kernel-2.6.18-215.el5.x86_64.rpm to the same place. This is the kernel build right before the patch you pointed to. To be completely thorough you could try it on your guest running this -221 host and see that it boots without any problem. However, looking at these logs we see the apic_write is certainly causing some havoc.

(In reply to comment #12)
> I uploaded kernel-2.6.18-215.el5.x86_64.rpm to the same place. This is the
> kernel build right before the patch you pointed to. To be completely thorough
> you could try it on your guest running this -221 host and see that it boots
> without any problem. However, looking at these logs we see the apic_write is
> certainly causing some havoc.

I can confirm that the -215 guest boots fine on the -221 host, as you suspected. Please let me know if you could use any further testing.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

This problem can be reproduced on host CPU model Intel Xeon E5405 with the guest conf described in attachment 501071 [details].
Host: 2.6.18-238.1.1.el5xen
Guest: 2.6.18-238.1.1.el5
However, if "apic = 1" is added to the conf, the guest boots up successfully.
BTW, this problem was not found on an AMD host (tested on AMD Opteron 1216), even without "apic = 1" in the guest conf.
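The guest-side conditions reported so far (HVM guest, a single vcpu, no "apic = 1") can be written down as a small check. This is a sketch under stated assumptions, not a Xen tool: `parse_guest_conf` and `matches_reported_hang` are hypothetical helpers, a missing or commented-out `apic` is treated like `apic = 0`, and a missing `vcpus` like `vcpus = 1`. A match is necessary but not sufficient, since the thread reports the hang only on some Intel hosts.

```python
def parse_guest_conf(text):
    """Collect single-line `key = value` settings from an xm-style config.

    Hypothetical helper: text after '#' is dropped, so commented-out
    options are ignored. Multi-line values are not handled.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        settings[key] = value.strip('"')
    return settings


def matches_reported_hang(settings):
    """True if the config has the shape that hung in this report.

    Assumptions: missing `apic` counts as 0, missing `vcpus` counts as 1.
    """
    is_hvm = settings.get("builder") == "hvm"
    vcpus = int(settings.get("vcpus", "1"))
    apic = int(settings.get("apic", "0"))
    return is_hvm and vcpus == 1 and apic == 0


failing = 'builder = "hvm"\nvcpus = 1\n# apic = 1\n'
with_apic = 'builder = "hvm"\nvcpus = 1\napic = 1\n'
two_vcpus = 'builder = "hvm"\nvcpus = 2\n# apic = 1\n'

for name, conf in [("failing", failing), ("with_apic", with_apic), ("two_vcpus", two_vcpus)]:
    print(name, matches_reported_hang(parse_guest_conf(conf)))
```

Both workarounds mentioned in this thread, adding "apic = 1" or raising the vcpu count to 2, make the check return False.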
Created attachment 520358 [details]
amend NMI stuck printk patch so that APIC stuff runs only on bare-metal
fix up commit b1c317b so that this block is only run on bare-metal:
Also a little bit of extra code is in this patch to help clean-up a
failed nmi_watchdog check by disabling various registers as noticed
during testing.
I like Laszlo's patch, it's the simplest that can work. Perhaps it would be _too_ simple for upstream or even RHEL6, but it's better to be surgical in RHEL5.

I tested the following configurations. I created the guests with virt-manager, and then updated the kernel in the RHEL-5.6.

  Guest                    Host
  ----------------------   -----------------------------------------
                           2.6.18-238.24.1.el5xen  2.6.18-281.el5xen
  2.6.18-238.el5       G1  boot OK                 boot OK
  2.6.18-238.24.1.el5  G1  boot OK                 boot OK
  2.6.18-274.el5       G1  boot OK                 boot OK

G1

  name = "rhel57-64bit-hvm-bz707966"
  uuid = "a0203b16-91a0-1958-27d6-f97e36cea95f"
  maxmem = 512
  memory = 512
  vcpus = 1
  builder = "hvm"
  kernel = "/usr/lib/xen/boot/hvmloader"
  boot = "c"
  pae = 1
  acpi = 1
  apic = 1
  localtime = 0
  on_poweroff = "destroy"
  on_reboot = "restart"
  on_crash = "restart"
  device_model = "/usr/lib64/xen/bin/qemu-dm"
  sdl = 0
  vnc = 1
  vncunused = 1
  keymap = "en-us"
  disk = [ "file:/var/lib/xen/images/rhel57-64bit-hvm-bz707966.img,hda,w",",hdc:cdrom,r" ]
  vif = [ "mac=00:16:36:0c:d4:4b,bridge=xenbr0,script=vif-bridge" ]
  parallel = "none"
  serial = "pty"

No problems seen. The host CPU is a Xeon W3550. I'll retry without apic=1.

Commented out the acpi and apic lines in the guest config seen in comment 19, still can't reproduce the problem (host: 2.6.18-238.24.1.el5xen, guest: 2.6.18-238.24.1.el5). Am I doing something wrong?

AFAICT b41ec96 was not backported to 5.6.z. Jan, can you try upgrading to 2.6.18-238.24.1? Thanks.

(In reply to comment #20)
> Jan, can you try upgrading to 2.6.18-238.24.1?

Or can you please apply attachment 520358 [details] on top of whatever canned kernel fails for you and retest? Thank you!

(In reply to comment #15)
> This problem can be reproduced on host CPU model Intel Xeon E5405 with guest
> conf as attachment 501071 [details] described.
>
> Host: 2.6.18-238.1.1.el5xen
> Guest: 2.6.18-238.1.1.el5
>
> While, if add "apic = 1" in the conf, the guest can boot up successfully.
>
> BTW, this problem not found on AMD host (tested on AMD Opteron 1216) even
> without "apic = 1" in the guest conf.

For completeness, I tried to reproduce the problem as follows:

- host: W3550 CPU, 2.6.18-238.1.1.el5xen
- guest: 2.6.18-238.1.1.el5; vm config: both with and without apic & acpi

No hang.

I finally managed to reproduce the hang on a Xeon E5405 host.

Host:
* -283 (hypervisor and dom0)

Guests:
* checked both -238.24.1 (most recent 5.6.z atm) and -283 (most recent 5.8 working build)
* Guest config (commenting out acpi and apic is critical -- when I left those uncommented, the boots worked flawlessly):

  name = "rhel56-64bit-hvm-bz707966"
  uuid = "cc9c19ec-41d5-4f89-894b-d24e5260c1d3"
  maxmem = 512
  memory = 512
  vcpus = 1
  builder = "hvm"
  kernel = "/usr/lib/xen/boot/hvmloader"
  boot = "c"
  pae = 1
  # acpi = 1
  # apic = 1
  localtime = 0
  on_poweroff = "destroy"
  on_reboot = "restart"
  on_crash = "restart"
  device_model = "/usr/lib64/xen/bin/qemu-dm"
  sdl = 0
  vnc = 1
  vncunused = 1
  keymap = "en-us"
  disk = [ "file:/var/lib/xen/images/rhel56-64bit-hvm-bz707966.img,hda,w", ",hdc:cdrom,r" ]
  vif = [ "mac=00:16:3e:5e:89:15,bridge=xenbr0,script=vif-bridge" ]
  parallel = "none"
  serial = "pty"

* hang reproduced under both guest kernels, after the following message was printed:

  Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled

Furthermore, the xm dmesg entries listed in comment 12 show up again:

  (XEN) vlapic.c:689:d6 Local APIC Write to read-only register 0x30
  (XEN) vlapic.c:689:d6 Local APIC Write to read-only register 0x20
  (XEN) vlapic.c:689:d6 Local APIC Write to read-only register 0x20
  (XEN) vlapic.c:689:d6 Local APIC Write to read-only register 0x20

(In reply to comment #8)
> Note: with 5.7, it will work again by chance, because patch b41ec96 ([xen]
> x86/hvm: Enable delivering 8259 interrupts to VCPUs != 0, 2011-03-02) will
> default to delivering interrupts to VCPU#0. Before that patch, the
> interrupts simply will not be delivered.
I might be testing under circumstances that are not valid for the quoted paragraph, but it doesn't seem to work like this. The hypervisor is -283 and the hang reproduces. (The first build to contain b41ec96 is -256.)

I'll check my guest patch (= attachment 520358 [details]) on:
- host = -283
- guest = -283
- acpi & apic commented out in the vm config

> I'll check my guest patch (= attachment 520358 [details]) on:
> - host = -283
> - guest = -283
> - acpi & apic commented out in the vm config
Guest boot succeeds.
Here's a quick summary of what I've done again today:

Host: 2.6.18-238.19.1.el5xen
Guest: 2.6.18-238.5.1.el5

without "apic=1", with just "vcpus=1": won't boot
without "apic=1", with "vcpus=2": boots
with "apic=1", with just "vcpus=1": boots

I'm on Scientific Linux, and I currently don't see an RPM for the -238.24.1 in there, and hence can't really test it. If you can build/provide one for me, I'll be happy to test it in the guest. I'm also happy to build my own RPM, but didn't find a suitable SRPM at ftp://ftp.redhat.com/redhat/linux/enterprise/5Server/en/os/SRPMS, sorry.

Hello Jan,

(In reply to comment #34)
> I'm also happy to build my own RPM, but didn't fidn a suitable SRPM at
> ftp://ftp.redhat.com/redhat/linux/enterprise/5Server/en/os/SRPMS, sorry.

the patch for the guest (attachment 520358 [details]) also applies to 2.6.18-238.19.1.el5, and should work the same way.

Created attachment 521671 [details]
make NMI_NONE the default watchdog in x86_64 hvm guests
x86 already defaults to NMI_NONE, there's no need to check if we're running as a guest.
x86 & x86_64: if the user specified "nmi_watchdog=...", the warning is warranted.
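The default-selection rule this attachment describes could be modeled roughly as follows. This is a sketch of the described behavior, not the patch itself: `effective_watchdog` and its parameters are hypothetical, and `bare_metal_default` is passed in because this report does not say what the x86_64 bare-metal default is. The NMI_* values follow the kernel's numbering, in which nmi_watchdog=1 selects the IO-APIC watchdog and nmi_watchdog=2 the local APIC one.

```python
NMI_NONE = 0
NMI_IO_APIC = 1
NMI_LOCAL_APIC = 2


def effective_watchdog(user_choice, arch, is_hvm_guest, bare_metal_default):
    """Pick the watchdog mode according to the rule described above.

    user_choice: value of "nmi_watchdog=..." if given on the command line,
                 otherwise None. An explicit choice is always honored, so a
                 later "NMI appears to be stuck" warning is then warranted.
    bare_metal_default: assumption supplied by the caller (hypothetical).
    """
    if user_choice is not None:
        return user_choice
    if arch == "x86":
        # x86 already defaults to NMI_NONE, guest or not.
        return NMI_NONE
    if arch == "x86_64" and is_hvm_guest:
        return NMI_NONE
    return bare_metal_default


# No command-line option in an x86_64 HVM guest: the watchdog stays off.
print(effective_watchdog(None, "x86_64", True, NMI_IO_APIC))
# An explicit nmi_watchdog=2 is still honored inside the guest.
print(effective_watchdog(NMI_LOCAL_APIC, "x86_64", True, NMI_IO_APIC))
```

Under this model the watchdog self-test never touches LVT0 in an x86_64 HVM guest unless the user asks for a watchdog explicitly.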
Comment on attachment 521671 [details]
make NMI_NONE the default watchdog in x86_64 hvm guests
looks good
I think this patch is the better approach.

Cheers,
Don

(In reply to comment #37)
> Created attachment 521671 [details]
> make NMI_NONE the default watchdog in x86_64 hvm guests

x86_64 hvm guest, apic=0 vm option
* -284 hangs,
* -284+patch works,
* -284+patch + nmi_watchdog=1 warns (WARNING: CPU#0: NMI appears to be stuck (0->0)!) and hangs,
* -284+patch + nmi_watchdog=2 doesn't warn and doesn't hang

x86_64 hvm guest, apic=1 vm option
* -284 works, /proc/sys/kernel/nmi_watchdog says 0
* -284+patch works, /proc/sys/kernel/nmi_watchdog says 0

(In reply to comment #37)
> Created attachment 521671 [details]
> make NMI_NONE the default watchdog in x86_64 hvm guests

Sanity checked on 32-bit hvm guest (pae=1 vm option and -284+patch i686 PAE guest kernel): works with both apic=0 and apic=1 vm opts.

Patch(es) available in kernel-2.6.18-285.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
A previously applied patch to help clean up a failed nmi_watchdog check by disabling various registers caused single-vcpu Xen HVM guests to become unresponsive during boot when the host CPU was an Intel Xeon Processor E5405 or an Intel Xeon Processor E5420, and the VM configuration did not have the apic = 1 parameter set. With this update, NMI_NONE is the default watchdog on AMD64 HVM guests, which fixes this issue.

Verified this problem with guest kernel 2.6.18-301.el5. It was also reproduced with the RHEL5.7 released kernel (274).

Version:
Host CPU model: Intel Xeon E5405
Host kernel: kernel-xen-2.6.18-300.el5
Guest kernel: kernel 2.6.18-301.el5

Verify Steps:
1. Set the guest conf with the two options below specified:
   vcpus = 1
   apic = 0 (or comment out this option)
2. Create the guest with the above conf.
Test result: The guest starts up without getting stuck and can be logged in to successfully.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html |