Bug 698842 - kvmclock: MP-BIOS bug: 8254 timer not connected to IO-APIC
Summary: kvmclock: MP-BIOS bug: 8254 timer not connected to IO-APIC
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Amos Kong
QA Contact: Zhouping Liu
URL:
Whiteboard:
Depends On:
Blocks: 707820
 
Reported: 2011-04-21 23:51 UTC by Amos Kong
Modified: 2018-11-29 20:13 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 707820
Environment:
Last Closed: 2012-02-21 03:46:00 UTC
Target Upstream Version:
Embargoed:


Attachments
draft patch from amos kong (1.50 KB, application/x-gzip)
2011-06-17 11:50 UTC, Amos Kong


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2012:0150 0 normal SHIPPED_LIVE Moderate: Red Hat Enterprise Linux 5.8 kernel update 2012-02-21 07:35:24 UTC

Description Amos Kong 2011-04-21 23:51:24 UTC
Description of problem:
Booting a RHEL 5.6 guest that uses kvmclock always ends in a guest kernel panic.

ACPI: Core revision 20060707
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter

Results with different guest kernel command lines:

1. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200 divider=10
Result: panic

2. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200
Result: boots successfully

3. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200 divider=10 noapic
Result: boots successfully


Version-Release number of selected component (if applicable):
guest kernel: 2.6.18-238.5.1.el5
host kernel: 2.6.32-131.0.1.el6.x86_64
qemu-kvm-0.12.1.2-2.158.el6.x86_64


How reproducible:
always

Steps to Reproduce:
1. Boot a RHEL 5.6 guest that uses kvmclock, with divider=10 on the guest kernel command line.
# qemu-kvm -cpu cpu64-rhel6,+sse2,+x2apic -rtc base=utc,clock=host,driftfix=slew ... 
  
Actual results:
The guest panics.

Expected results:
The guest boots successfully.

Additional info:

# qemu-kvm -name vm1 -chardev socket,id=qmp_monitor_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20110420-105736-3Eig,server,nowait -mon chardev=qmp_monitor_id_qmpmonitor1,mode=control -chardev socket,id=serial_id_20110420-105736-3Eig,path=/tmp/serial-20110420-105736-3Eig,server,nowait -device isa-serial,chardev=serial_id_20110420-105736-3Eig -drive file=/home/devel/autotest-devel/client/tests/kvm/images/RHEL-Server-5.6-64-virtio.qcow2,index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,netdev=idS1QAcm,mac=9a:8a:ec:e1:df:1a,id=ndev00idS1QAcm,bus=pci.0,addr=0x3 -netdev tap,id=idS1QAcm,vhost=on,ifname=t0-105736-3Eig,script=/home/devel/autotest-devel/client/tests/kvm/scripts/qemu-ifup-switch,downscript=no -m 2048 -smp 2,cores=1,threads=1,sockets=2 -cpu cpu64-rhel6,+sse2,+x2apic -vnc :1 -rtc base=utc,clock=host,driftfix=slew -M rhel6.1.0 -boot order=cdn,once=c,menu=off -usbdevice tablet -no-kvm-pit-reinjection -enable-kvm

Comment 1 Amos Kong 2011-04-21 23:53:47 UTC
When I disable the kvmclock CPU flag on the qemu command line, the guest boots successfully.

qemu-kvm ... -cpu cpu64-rhel6,+sse2,+x2apic,-kvmclock 

1. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200 divider=10
Result: boots successfully

2. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200
Result: boots successfully

3. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200 divider=10 noapic
Result: boots successfully

Comment 3 Zachary Amsden 2011-06-07 00:48:31 UTC
Going to need to know host hardware platform; also need to check if this is the latest guest kernel available.

Please also indicate if the guest is 32-bit or 64-bit rhel6.

In all probability this is a guest bug and not a blocker for kvm, but I'll leave it as proposed for 6.2 until we are certain.

Comment 4 Amos Kong 2011-06-08 09:52:45 UTC
It can only be reproduced with a 64-bit RHEL 5 guest, and there it reproduces 100% of the time.
guest kernel cmdline:   ... clocksource=kvmclock divider=10

guest Kernel: 2.6.18-262.el5
host Kernel: 2.6.32-150.el6.x86_64
qemu-kvm: qemu-kvm-0.12.1.2-2.162.el6.x86_64

Comment 5 Zachary Amsden 2011-06-08 16:20:07 UTC
Can you remove the divider=10 from the guest kernel command line for the kvmclock choice?  It should do nothing, but there is always a chance that it is still dividing down the guest clock internally and causing some sort of failure.

Pretty certain this is a guest bug, will have to go look at the 5.6 64-bit code to see what's going on.

Comment 6 Amos Kong 2011-06-16 00:30:36 UTC
I have talked with Zachary; this appears to be a bug in the 64-bit RHEL 5 guest.

divider=10 probably causes the timekeeping to count 1/10th the number of interrupts, or sets the PIT rate too low during boot.
The count may get divided by DIVIDER, or a HZ variable may be used incorrectly.
The other alternative is to detect that the guest is running under KVM and skip the test.

Since there's an easy workaround, we can try to fix it in the 5.8 cycle and close this bug as WONTFIX.
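
The first hypothesis above can be made concrete with a little arithmetic. The following user-space sketch is only an illustration, not kernel code. It assumes that the boot-time check waits for about ten HZ ticks (an assumption) and passes only when the jiffies delta exceeds 4 (the `> 4` test appears in the timer_irq_works() excerpt in Comment 8), and that each delivered interrupt advances jiffies by the divider. All names are hypothetical.

```c
/* Assumption: the boot-time check waits for about ten HZ ticks. */
#define SKETCH_WINDOW_TICKS 10UL
/* From the timer_irq_works() excerpt: the delta must exceed 4 to pass. */
#define SKETCH_MIN_DELTA 4UL

/* Interrupts expected inside the window when only every divider-th
 * tick is delivered. divider must be at least 1. */
static unsigned long expected_interrupts(unsigned long divider)
{
    return SKETCH_WINDOW_TICKS / divider;
}

/* jiffies delta seen by the check, assuming each delivered interrupt
 * advances jiffies by `divider`. */
static unsigned long jiffies_delta(unsigned long interrupts,
                                   unsigned long divider)
{
    return interrupts * divider;
}

/* How many interrupts can be lost before the check fails. */
static unsigned long lost_interrupt_margin(unsigned long divider)
{
    unsigned long expected = expected_interrupts(divider);
    unsigned long lost = 0;

    while (lost < expected &&
           jiffies_delta(expected - lost - 1, divider) > SKETCH_MIN_DELTA)
        lost++;
    return lost;
}
```

Under these assumptions the check tolerates five lost interrupts without a divider, but with divider=10 the window covers a single interrupt, so one late interrupt leaves a delta of 0. That is consistent with the jiffies deltas of 0, 10 and 20 reported in Comment 8, though it does not by itself prove the root cause.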

Comment 7 Dor Laor 2011-06-16 10:11:57 UTC
Let's try fixing it in 5.8, since divider=10 reduces the guest wakeups by a factor of 10. In the meantime, please add CFFR (https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes) notes to the BZ.

Comment 8 Amos Kong 2011-06-17 11:46:55 UTC
I added a debug statement to timer_irq_works() and called this function many times at the top of check_timer().
After many test runs, I found that jiffies does not change during the first call to timer_irq_works().
So I have a fix for this bug: add a small delay for kvmclock before the real call to timer_irq_works().
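
The idea of delaying before the real check can be sketched in user space as follows. This is not the attached patch, whose contents are not shown in this report. It is a hypothetical model in which a simulated timer stays frozen for a few wait windows and then advances, and every name in it (fake_timer, fake_wait_window, fake_check_after_settle) is invented for illustration.

```c
/* Simulated timer: jiffies stays frozen for the first `stall_windows`
 * wait windows, then advances by `step` per window. Illustrative only. */
struct fake_timer {
    unsigned long jiffies;
    unsigned long windows_waited;
    unsigned long stall_windows;
    unsigned long step;
};

/* Stand-in for the delay performed inside the check. */
static void fake_wait_window(struct fake_timer *t)
{
    if (t->windows_waited >= t->stall_windows)
        t->jiffies += t->step;
    t->windows_waited++;
}

/* Same pass condition as the excerpt: the delta must exceed 4. */
static int fake_timer_irq_works(struct fake_timer *t)
{
    unsigned long t1 = t->jiffies;

    fake_wait_window(t);
    return t->jiffies - t1 > 4;
}

/* Hypothetical fix: wait a bounded number of windows until jiffies
 * first moves, then run the real check. */
static int fake_check_after_settle(struct fake_timer *t,
                                   unsigned int max_windows)
{
    unsigned long start = t->jiffies;
    unsigned int i;

    for (i = 0; i < max_windows && t->jiffies == start; i++)
        fake_wait_window(t);
    return fake_timer_irq_works(t);
}
```

With a timer that is frozen for one window and then steps by 10, the plain check fails while the settled check passes. The wait is bounded, so a timer that never starts still fails the check instead of hanging the boot.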

1. debug patch:
diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 9a764b9..2f0cf38 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -1479,7 +1484,8 @@ static int __init timer_irq_works(void)
         */
 
        /* jiffies wrap? */
+        printk("t1: %lu, jiffies: %lu, \t%lu\n", t1, jiffies, jiffies-t1);
        if (jiffies - t1 > 4)
                return 1;
        return 0;
 }
@@ -1990,6 +1990,16 @@ static inline void check_timer(void)
        apic_printk(APIC_VERBOSE,KERN_INFO "..TIMER: vector=0x%02X apic1=%d pin1=%d apic2=%d pin2=%d\n",
                vector, apic1, pin1, apic2, pin2);
 
+       timer_irq_works();
+       timer_irq_works();
+       timer_irq_works();
+       ....
+       ....
        if (pin1 != -1) {
                /*
                 * Ok, does IRQ0 through the IOAPIC work?

2. dmesg info:
ACPI: Core revision 20060707
t1: 4294667566, jiffies: 4294667566,    0
t1: 4294667566, jiffies: 4294667576,    10
t1: 4294667576, jiffies: 4294667596,    20
t1: 4294667596, jiffies: 4294667606,    10
t1: 4294667606, jiffies: 4294667616,    10
t1: 4294667616, jiffies: 4294667626,    10
t1: 4294667626, jiffies: 4294667636,    10
t1: 4294667636, jiffies: 4294667656,    20
t1: 4294667656, jiffies: 4294667666,    10
t1: 4294667666, jiffies: 4294667676,    10
t1: 4294667676, jiffies: 4294667686,    10

3. My patch will be attached.
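
To make the dmesg output above easier to read, the following user-space sketch applies the `jiffies - t1 > 4` test from timer_irq_works() to the first four logged pairs. The sample values are copied from the log; the struct and function names are hypothetical.

```c
#include <stddef.h>

struct jiffies_sample {
    unsigned long t1;   /* jiffies when the check started */
    unsigned long now;  /* jiffies when the check finished */
};

/* First four (t1, jiffies) pairs copied from the dmesg output. */
static const struct jiffies_sample dmesg_samples[] = {
    { 4294667566UL, 4294667566UL },
    { 4294667566UL, 4294667576UL },
    { 4294667576UL, 4294667596UL },
    { 4294667596UL, 4294667606UL },
};

/* Pass condition from timer_irq_works(): the delta must exceed 4. */
static int sample_passes(const struct jiffies_sample *s)
{
    return s->now - s->t1 > 4;
}

/* Count how many samples would make the check report failure. */
static size_t count_failures(const struct jiffies_sample *s, size_t n)
{
    size_t i, failures = 0;

    for (i = 0; i < n; i++)
        if (!sample_passes(&s[i]))
            failures++;
    return failures;
}
```

Only the first sample fails, because its delta is 0. Every later call sees a delta of 10 or 20 and would pass.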

Comment 9 Amos Kong 2011-06-17 11:50:28 UTC
Created attachment 505263
draft patch from amos kong

0001-io_apic-Make-kernel-option-no_timer_check-always-wor.patch
0002-io_apic-Comparing-jiffies-with-other-values-by-time_.patch
0003-io_apic-Add-a-small-delay-before-real-calling-timer_.patch

Comment 10 Amos Kong 2011-06-18 01:37:52 UTC
Brew build: https://brewweb.devel.redhat.com/taskinfo?taskID=3403465

Comment 11 Prarit Bhargava 2011-06-23 17:42:26 UTC
Amos,

From a bare-metal point of view,

0001-io_apic-Make-kernel-option-no_timer_check-always-wor.patch

-               if (!no_timer_check && timer_irq_works()) {

Will call timer_irq_works if no_timer_check == 0

+               if (timer_irq_works()) {

Will only run the *core* of timer_irq_works if no_timer_check == 0.

So this is okay.

Ack.
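
The change reviewed above can be pictured with a small user-space sketch. It is a guess at the shape of the patch based only on its title and on this review, not the patch itself. In particular, the early return value of 1 when the flag is set is an assumption, and all names are hypothetical.

```c
/* Stand-in for the kernel flag set by the no_timer_check boot option. */
static int sketch_no_timer_check;

/* Old call-site shape: `!no_timer_check && timer_irq_works()`.
 * `check_result` is what timer_irq_works() would have returned. */
static int old_call_site(int check_result)
{
    return !sketch_no_timer_check && check_result;
}

/* New shape as this review reads it: the flag is tested inside the
 * function, so the core check only runs when the flag is 0.
 * Returning 1 ("timer works") when the flag is set is an assumption. */
static int new_timer_irq_works(unsigned long t1, unsigned long now)
{
    if (sketch_no_timer_check)
        return 1;
    return now - t1 > 4;
}
```

With the flag at 0, both shapes give the same answer, which matches the bare-metal reasoning above. With the flag set, the old call site always evaluates to false, while the sketched new function skips the check and reports success.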

0002-io_apic-Comparing-jiffies-with-other-values-by-time_.patch

Ack.
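
For readers unfamiliar with why jiffies should be compared with the time_after() family rather than with plain relational operators, here is a minimal user-space sketch of the idea. It imitates the style of the kernel's time_after(a, b) and is not the kernel macro itself or the contents of this patch. The function names are hypothetical.

```c
#include <limits.h>

/* Naive comparison: gives the wrong answer once the counter has
 * wrapped past zero. */
static int naive_after(unsigned long a, unsigned long b)
{
    return a > b;
}

/* Wrap-safe comparison in the style of time_after(a, b): true when
 * a is later than b, provided the two values are less than half the
 * counter range apart. */
static int sketch_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}
```

For example, if b is ULONG_MAX - 5 and a is b + 10 (which wraps around to 4), the naive comparison says a is not after b, while the wrap-safe one correctly says it is.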

0003-io_apic-Add-a-small-delay-before-real-calling-timer_.patch

Only affects KVM.  But you spelled period as periad ;)

Ack (if you fix the spelling mistake).

P.

Comment 16 Jarod Wilson 2011-08-23 14:00:49 UTC
Patch(es) available in kernel-2.6.18-282.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

Comment 18 Zhouping Liu 2011-10-18 12:01:15 UTC
Thanks Amos, I reproduced the panic on these versions:
qemu-kvm-0.12.1.2-2.160.el6_1.8.x86_64
host kernel: kernel-2.6.32-182.el6
guest kernel:  kernel-2.6.18-238.5.1.el5 x86_64
               kernel-2.6.18-274.el5 x86_64
with the console message:
Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter

verified it on:
qemu-kvm-0.12.1.2-2.160.el6_1.8.x86_64
host kernel: kernel-2.6.32-182.el6
guest kernel: kernel-2.6.18-285.el5 x86_64
No panic; the guest boots successfully.

So I am moving the status to VERIFIED.

Comment 19 errata-xmlrpc 2012-02-21 03:46:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html

