Bug 661211

Summary: rhel6 hvm guests get soft lockups if tsc fails
Product: Red Hat Enterprise Linux 5 Reporter: Binbin Yu <byu>
Component: xen Assignee: Paolo Bonzini <pbonzini>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.7 CC: drjones, jzheng, leiwang, lersek, mrezanin, pbonzini, qguan, qwan, tburke, xen-maint, yuzhang, yuzhou
Target Milestone: --- Keywords: ReleaseNotes
Target Release: 5.8   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: xen-3.0.3-133.el5 Doc Type: Bug Fix
Doc Text:
In some cases, Red Hat Enterprise Linux 6 guests running fully-virtualized under Red Hat Enterprise Linux 5 experience time drift or fail to boot. In some cases, drifting may start after migration of the virtual machine to a host with different speed. This is due to limitations in the Red Hat Enterprise Linux 5 Xen hypervisor. To work around this, add "clocksource=acpi_pm" or "clocksource=jiffies" to the kernel command line for the guest. Alternatively, if running under Red Hat Enterprise Linux 5.7 or newer, locate the guest configuration file for the guest and add "hpet=0" there.
Story Points: ---
Clone Of:
: 745713 (view as bug list) Environment:
Last Closed: 2012-02-21 05:54:39 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 745713    
Bug Blocks: 699611    
Attachments:
Description                            Flags
config file                            none
console output when failing boot       none
xend log for failing boot              none
xm dmesg for failing boot              none
boot log of xen132                     none
boot log of xen-3.0.3-132.el5661211    none
20111012-661211-acpi_pm-boot-log       none
20111012-661211-hpet-0-boot-log        none

Description Binbin Yu 2010-12-08 07:08:11 UTC
Description of problem:
We can't boot a RHEL6 32-bit HVM guest with 4 vcpus on the large x86_64 host; booting fails even with vcpus=2. Installing the RHEL6 32-bit HVM guest via PXE or an ISO image fails as well.


Version-Release number of selected component (if applicable):
host: x86_64, kernel-xen-2.6.18-236.el5, xen-3.0.3-120.el5, 96-core, 980G ram
guest: RHEL-Server-6.0-32-20100922.1-hvm.raw


How reproducible:
100%


Steps to Reproduce:
1. set vcpus=4 in the config file and do "xm create $config -c"
2.

 
Actual results:
The RHEL6 32-bit HVM guest can't boot with 4 vcpus; the console prints lots of "BUG: soft lockup - CPU#0 stuck for 108s" messages.


Expected results:
RHEL6 32bit hvm guest should boot normally and work well.


Additional info:
1. We can boot the RHEL6 32-bit HVM guest with 4 vcpus on my own work machine, which has 4 CPUs and 8G RAM.
2. We tried this case on the large host with kernel-xen-231, and it failed as well.

Comment 1 Binbin Yu 2010-12-08 07:14:37 UTC
Created attachment 467390 [details]
config file

Comment 2 Binbin Yu 2010-12-08 07:17:24 UTC
Created attachment 467392 [details]
console output when failing boot

Comment 3 Binbin Yu 2010-12-08 07:18:07 UTC
Created attachment 467394 [details]
xend log for failing boot

Comment 4 Binbin Yu 2010-12-08 07:18:41 UTC
Created attachment 467395 [details]
xm dmesg for failing boot

Comment 5 Binbin Yu 2010-12-08 07:19:50 UTC
xm info on the large host:

[root@intel-e7450-512-1 xen]# xm info
host                   : intel-e7450-512-1.englab.nay.redhat.com
release                : 2.6.18-236.el5xen
version                : #1 SMP Mon Dec 6 19:01:22 EST 2010
machine                : x86_64
nr_cpus                : 96
nr_nodes               : 1
sockets_per_node       : 16
cores_per_socket       : 6
threads_per_core       : 1
cpu_mhz                : 2398
hw_caps                : bfebfbff:20100800:00000000:00000940:000ce3bd:00000000:00000001
total_memory           : 982014
free_memory            : 930427
node_to_cpu            : node0:0-95
xen_major              : 3
xen_minor              : 1
xen_extra              : .2-236.el5
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64 
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.1.2 20080704 (Red Hat 4.1.2-48)
cc_compile_by          : mockbuild
cc_compile_domain      : redhat.com
cc_compile_date        : Mon Dec  6 18:38:03 EST 2010
xend_config_format     : 2

Comment 6 Laszlo Ersek 2010-12-08 10:53:35 UTC
1. Can you reproduce the "INFO: task khubd:45 blocked for more than 120 seconds." message if you retry the boot?

2. Can you please get a crash dump of the guest?

3. It would be interesting to install "hwloc" (http://www.open-mpi.org/projects/hwloc/) on the host from source, and run "lstopo host.png". If you have time for this, please attach "host.png" (shouldn't be big). Thanks.

Comment 7 Andrew Jones 2010-12-08 11:12:19 UTC
Just to be clear, since it wasn't written anywhere explicitly, the guest can be booted with 1 vcpu, correct? Are you attempting to boot with or without pv-on-hvm drivers? Or both?

Comment 8 Yufang Zhang 2010-12-08 11:21:20 UTC
(In reply to comment #7)
> Just to be clear, since it wasn't written anywhere explicitly, the guest can be
> booted with 1 vcpu, correct?

Yes. The guest can be booted with 1 vcpu.

> Are you attempting to boot with or without
> pv-on-hvm drivers? Or both?

This would be confirmed by byu tomorrow.

Comment 9 Lei Wang 2010-12-09 07:56:51 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > Just to be clear, since it wasn't written anywhere explicitly, the guest can be
> > booted with 1 vcpu, correct?
> 
> Yes. The guest can be booted with 1 vcpu.
> 
> > Are you attempting to boot with or without
> > pv-on-hvm drivers? Or both?
> 
> This would be confirmed by byu tomorrow.

We booted the guest with pv_on_hvm=enable when the bug occurred.

We will try to provide more info according to comment 6 and comment 7, maybe next week, because the big machine is currently in use by another team.

Comment 10 Andrew Jones 2010-12-09 09:22:49 UTC
Ok, so I understand that we haven't tried > 1 vcpu without pv_on_hvm yet. We should definitely try that when the machine is returned. Thanks!

Comment 11 Qixiang Wan 2011-02-09 09:09:18 UTC
(In reply to comment #10)
> Ok, so I understand that we haven't tried > 1 vcpu without pv_on_hvm yet. We
> should definitely try that when the machine is returned. Thanks!

The problem also exists when the guest is started without pv_on_hvm=enable.
For now it can only be reproduced on the machine the reporter used.

Comment 14 Andrew Jones 2011-02-10 17:31:46 UTC
This appears to be clocksource related. When I reproduced the problem I saw we get soft lockups during boot that eventually make the kernel give up. I looked closer at the logs that came out prior to the first soft lockup and saw

...
TSC synchronization [CPU#0 -> CPU#1]:
Measured 20035715238 cycles TSC warp between CPUs, turning off TSC clock.
Marking TSC unstable due to check_tsc_sync_source failed
...
* Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
* this clock source is slow. Consider trying other clock sources
...
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
hpet0: 3 comparators, 64-bit 74.964345 MHz counter
Switching to clocksource hpet
BUG: soft lockup - CPU#0 stuck for 106s! [swapper:1]
...

adding clocksource=jiffies to the command line allowed me to boot and use the machine fine with 4 vcpus, both with and without pv_on_hvm drivers enabled.

This problem isn't seen on most machines (like my test box) because the tsc generally works on them (like it does on my box).

I also tried with the latest 6.1 beta kernel on this machine. The problem reproduced without clocksource=jiffies and went away with clocksource=jiffies. That's not too unexpected considering we haven't changed anything with clocksource related code. It does mean I need to look into a 6.1 fix. I'll start by testing an upstream kernel to see if a fix exists.
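
The sequence in the log excerpt above (TSC marked unstable, switch to hpet, then soft lockups) can be checked for mechanically when triaging other boot logs. The following is only a sketch: the patterns are copied from the excerpt in this comment, and the function names are hypothetical, not part of any Xen or kernel tooling.

```python
import re

# Patterns copied from the boot log excerpt in this comment.
PATTERNS = {
    "tsc_unstable": re.compile(r"Marking TSC unstable"),
    "switched_to_hpet": re.compile(r"Switching to clocksource hpet"),
    "soft_lockup": re.compile(r"BUG: soft lockup - CPU#\d+ stuck for \d+s"),
}


def scan_boot_log(text):
    """Return the 0-based line index of the first match per pattern, or None."""
    hits = {name: None for name in PATTERNS}
    for i, line in enumerate(text.splitlines()):
        for name, pattern in PATTERNS.items():
            if hits[name] is None and pattern.search(line):
                hits[name] = i
    return hits


def matches_this_bug(text):
    """True if TSC was marked unstable, then hpet was selected, then a lockup followed."""
    hits = scan_boot_log(text)
    if None in hits.values():
        return False
    return hits["tsc_unstable"] < hits["switched_to_hpet"] < hits["soft_lockup"]
```

A log that matches is a candidate for the clocksource=jiffies workaround; a log that does not match may have an unrelated cause.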

Comment 15 Andrew Jones 2011-02-10 17:41:07 UTC
Another note is that I installed a 64-bit rhel 6.1 beta HVM guest and got soft lockups with it as well. So this isn't 32-bit specific.

Comment 16 Andrew Jones 2011-02-14 14:12:30 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
RHEL6 Xen HVM guests may experience constant soft lockups when booting on some machines. A possible workaround is to boot with clocksource=jiffies on the guest's kernel command line.

Comment 17 Andrew Jones 2011-02-14 16:16:34 UTC
The latest Linux kernel (2.6.38-rc4+) also detects warp for the tsc and switches to hpet on this machine and eventually fails to boot. The symptoms of the boot failure are different, but if you look at where it BUGs it's likely due to a softirq not firing. It can be worked around in the same way, with 'clocksource=jiffies'.

Another note is that with both the rhel kernel and the latest upstream kernel the hpet appears to be stable and usable if the vcpus are pinned to pcpus that are on the same node of this numa machine.

Comment 20 Paolo Bonzini 2011-02-17 14:27:50 UTC
The problem seems to be that the HPET is unreliable on RHEL5 Xen.  Perhaps we can blacklist it in the kernel, so that it goes straight from tsc to jiffies?

Comment 22 Andrew Jones 2011-06-10 12:37:56 UTC
*** Bug 712319 has been marked as a duplicate of this bug. ***

Comment 23 Paolo Bonzini 2011-07-15 12:11:57 UTC
This needs a patch like this:

diff --git a/tools/ioemu/hw/piix4acpi.c b/tools/ioemu/hw/piix4acpi.c
index f607074..0229773 100644
--- a/tools/ioemu/hw/piix4acpi.c
+++ b/tools/ioemu/hw/piix4acpi.c
@@ -532,7 +532,7 @@ void pci_piix4_acpi_init(PCIBus *bus, int devfn)
     pci_conf[0x01] = 0x80;
     pci_conf[0x02] = 0x13;
     pci_conf[0x03] = 0x71;
-    pci_conf[0x08] = 0x01;  /* B0 stepping */
+    pci_conf[0x08] = 0x03;
     pci_conf[0x09] = 0x00;  /* base class */
     pci_conf[0x0a] = 0x80;  /* Sub class */
     pci_conf[0x0b] = 0x06;

which is a backport of this upstream qemu commit:

commit a78b03cb6985466beb006b4e0eec4ba22d537c43
Author: balrog <balrog@c046a42c-6fe2-441c-8c8c-71466251a162>
Date:   Mon Jan 14 03:43:18 2008 +0000

    Bump ACPI/SMBus PIIX4 controller revision to 3 (Marcelo Tosatti).
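
For readers unfamiliar with the bytes in the diff: offset 0x08 of PCI configuration space is the revision ID, and offsets 0x00-0x03 hold the vendor and device IDs in little-endian order. The sketch below decodes them, assuming the standard PCI header layout. The value at offset 0x00 is an assumption, because the diff context starts at 0x01. The revision bump is presumably related to the "Found PM-Timer Bug on the chipset" message in comment 14, but the comment itself does not say so.

```python
def decode_pci_ids(conf):
    """Decode vendor ID, device ID and revision ID from PCI config space bytes."""
    vendor = conf[0x00] | (conf[0x01] << 8)   # little-endian 16-bit vendor ID
    device = conf[0x02] | (conf[0x03] << 8)   # little-endian 16-bit device ID
    revision = conf[0x08]                     # revision ID byte touched by the patch
    return vendor, device, revision


conf = bytearray(256)
conf[0x00] = 0x86  # assumption: this byte is not visible in the diff context
conf[0x01] = 0x80
conf[0x02] = 0x13
conf[0x03] = 0x71
conf[0x08] = 0x03  # value after the patch (was 0x01 before)

vendor, device, revision = decode_pci_ids(conf)
print(f"vendor={vendor:#06x} device={device:#06x} revision={revision}")
```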

Comment 24 Paolo Bonzini 2011-07-15 12:12:52 UTC
Deleted Technical Notes Contents.

Old Contents:
RHEL6 Xen HVM guests may experience constant soft lockups when booting on some machines. A possible workaround is to boot with clocksource=jiffies on the guest's kernel command line.

Comment 26 Miroslav Rezanina 2011-07-20 09:02:38 UTC
Hi, can you please retest the problem with RHEL 5.7 xen package and with 
package from brew build:

https://brewweb.devel.redhat.com/taskinfo?taskID=3503146

Is the problem solved in brew build?

Comment 27 Yuyu Zhou 2011-07-21 03:29:53 UTC
Hi, Miroslav

The problem still exists in both xen132 and xen-3.0.3-132.el5661211.

Boot logs attached.

Yuyu Zhou

Comment 28 Yuyu Zhou 2011-07-21 03:30:44 UTC
Created attachment 514114 [details]
boot log of xen132

Comment 29 Yuyu Zhou 2011-07-21 03:31:19 UTC
Created attachment 514115 [details]
boot log of xen-3.0.3-132.el5661211

Comment 39 Paolo Bonzini 2011-08-25 15:51:08 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
In some cases, Red Hat Enterprise Linux 6 guests running fully-virtualized under Red Hat Enterprise Linux 5 experience time drift or fail to boot.  In some cases, drifting may start after migration of the virtual machine to a host with different speed.  This is due to limitations in the Red Hat Enterprise Linux 5 Xen hypervisor.  To work around this, add "clocksource=acpi_pm" to the kernel command line for the guest.  Alternatively, if running under Red Hat Enterprise Linux 5.7 or newer, locate the guest configuration file for the guest and add "hpet=0" there.

Comment 42 Paolo Bonzini 2011-10-12 09:48:35 UTC
Yes, "acpi_pm" should work, what does the call trace look like?  Same as comment 2? Also, can you attach the boot log for "hpet=0"?

Changing the technote to jiffies doesn't sound too bad anyway, but I'd like to understand what's going on.

Comment 43 Qin Guan 2011-10-12 10:10:49 UTC
Created attachment 527662 [details]
20111012-661211-acpi_pm-boot-log

Call Trace when set "clocksource=acpi_pm".

Comment 44 Qin Guan 2011-10-12 10:13:03 UTC
Created attachment 527663 [details]
20111012-661211-hpet-0-boot-log

Boot log for hpet=0.

Comment 45 Andrew Jones 2011-10-12 10:32:56 UTC
Setting clocksource=acpi_pm isn't enough to override the hpet. We have this in the boot log before the traces start

...
Switching to clocksource hpet
...
Override clocksource acpi_pm is not HRT compatible. Cannot switch while in HRT/NOHZ mode
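
Once a guest does boot, whether an override actually took effect can be confirmed from inside the guest through the kernel's clocksource sysfs files. This is a minimal sketch: the function names are hypothetical, and the sysfs directory is passed as a parameter so the check can be pointed at a copy of the files.

```python
from pathlib import Path

# Standard Linux sysfs location for the system clocksource.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(sysfs_dir=SYSFS_DIR):
    """Return (current clocksource, list of available clocksources)."""
    base = Path(sysfs_dir)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def override_took_effect(requested, sysfs_dir=SYSFS_DIR):
    """True if the kernel is really using the clocksource that was requested."""
    current, _ = read_clocksources(sysfs_dir)
    return current == requested
```

In the situation described here, a guest started with clocksource=acpi_pm would be expected to still report hpet as current, since the override was rejected.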

Comment 46 Paolo Bonzini 2011-10-12 11:51:18 UTC
Then it must be jiffies.

Comment 47 Paolo Bonzini 2011-10-12 11:51:18 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-In some cases, Red Hat Enterprise Linux 6 guests running fully-virtualized under Red Hat Enterprise Linux 5 experience time drift or fail to boot.  In some cases, drifting may start after migration of the virtual machine to a host with different speed.  This is due to limitations in the Red Hat Enterprise Linux 5 Xen hypervisor.  To work around this, add "clocksource=acpi_pm" to the kernel command line for the guest.  Alternatively, if running under Red Hat Enterprise Linux 5.7 or newer, locate the guest configuration file for the guest and add "hpet=0" there.
+In some cases, Red Hat Enterprise Linux 6 guests running fully-virtualized under Red Hat Enterprise Linux 5 experience time drift or fail to boot.  In some cases, drifting may start after migration of the virtual machine to a host with different speed.  This is due to limitations in the Red Hat Enterprise Linux 5 Xen hypervisor.  To work around this, add "clocksource=jiffies" to the kernel command line for the guest.  Alternatively, if running under Red Hat Enterprise Linux 5.7 or newer, locate the guest configuration file for the guest and add "hpet=0" there.

Comment 48 Paolo Bonzini 2011-10-12 12:00:50 UTC
I put jiffies in the meanwhile.  However, I found this too: https://lkml.org/lkml/2011/5/19/490 and I'll brew a kernel for testing soon.  If that fixes acpi_pm, we should include that patch in RHEL6 too.

Comment 49 Andrew Jones 2011-10-12 13:04:19 UTC
Just putting nohpet on the guest's kernel command line might also work.

Comment 50 Paolo Bonzini 2011-10-13 07:30:27 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1 +1 @@
-In some cases, Red Hat Enterprise Linux 6 guests running fully-virtualized under Red Hat Enterprise Linux 5 experience time drift or fail to boot.  In some cases, drifting may start after migration of the virtual machine to a host with different speed.  This is due to limitations in the Red Hat Enterprise Linux 5 Xen hypervisor.  To work around this, add "clocksource=jiffies" to the kernel command line for the guest.  Alternatively, if running under Red Hat Enterprise Linux 5.7 or newer, locate the guest configuration file for the guest and add "hpet=0" there.
+In some cases, Red Hat Enterprise Linux 6 guests running fully-virtualized under Red Hat Enterprise Linux 5 experience time drift or fail to boot.  In some cases, drifting may start after migration of the virtual machine to a host with different speed.  This is due to limitations in the Red Hat Enterprise Linux 5 Xen hypervisor.  To work around this, add "clocksource=acpi_pm" or "clocksource=jiffies" to the kernel command line for the guest.  Alternatively, if running under Red Hat Enterprise Linux 5.7 or newer, locate the guest configuration file for the guest and add "hpet=0" there.

Comment 51 Paolo Bonzini 2011-10-19 14:27:08 UTC
Documented at https://access.redhat.com/kb/docs/DOC-65074

Comment 52 Qin Guan 2012-01-04 07:51:48 UTC
Verified this problem with 2.6.18-302.el5xen.

Version:
kernel-xen-2.6.18-302.el5
xen-3.0.3-135.el5
xen-libs-3.0.3-135.el5

Host CPU: Intel E7450

Steps:
1. Create a RHEL6.2 HVM guest with vcpus=8
2. The guest prints a call trace with the message "BUG: soft lockup - CPU#5 stuck" on the console when no clocksource is specified
3. The guest starts up successfully when the clocksource is specified in one of the following ways:
- Set "hpet=0" in the guest conf
- Add "clocksource=acpi_pm" to the guest kernel command line
- Add "clocksource=jiffies" to the guest kernel command line
- Add "nohpet" to the guest kernel command line
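
The three command-line workarounds above all amount to appending one parameter to the guest's kernel command line. A small helper like the following could do that without adding duplicates. It is only a sketch: the function name is hypothetical and the command line used in the usage example is illustrative, not taken from the tested guest.

```python
def add_kernel_param(cmdline, param):
    """Append param to a kernel command line unless its key is already set.

    The key is the part before '=' (for example "clocksource"), or the whole
    token for flag-style parameters such as "nohpet".
    """
    key = param.split("=", 1)[0]
    tokens = cmdline.split()
    for token in tokens:
        if token.split("=", 1)[0] == key:
            return cmdline  # key already present; leave the line unchanged
    return " ".join(tokens + [param])


# Illustrative command line; the root device is an assumption.
line = "ro root=/dev/vda1 quiet"
print(add_kernel_param(line, "clocksource=jiffies"))
print(add_kernel_param(line, "nohpet"))
```

An existing clocksource= setting is deliberately left alone rather than replaced, so a value chosen by hand is not silently overridden.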

Comment 53 errata-xmlrpc 2012-02-21 05:54:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0160.html