Description of problem:
Booting a rhel5.6 guest that uses kvmclock always panics the guest.

ACPI: Core revision 20060707
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter

Guest kernel command lines tested:
1. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200 divider=10
   panic
2. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200
   boots successfully
3. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200 divider=10 noapic
   boots successfully

Version-Release number of selected component (if applicable):
guest kernel: 2.6.18-238.5.1.el5
host kernel: 2.6.32-131.0.1.el6.x86_64
qemu-kvm-0.12.1.2-2.158.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot up a rhel5.6 guest using kvmclock, with divider=10 in the guest kernel parameters.
# qemu-kvm -cpu cpu64-rhel6,+sse2,+x2apic -rtc base=utc,clock=host,driftfix=slew ...

Actual results:
The guest panics.

Expected results:
The guest boots successfully.

Additional info:
# qemu-kvm -name vm1 \
    -chardev socket,id=qmp_monitor_id_qmpmonitor1,path=/tmp/monitor-qmpmonitor1-20110420-105736-3Eig,server,nowait \
    -mon chardev=qmp_monitor_id_qmpmonitor1,mode=control \
    -chardev socket,id=serial_id_20110420-105736-3Eig,path=/tmp/serial-20110420-105736-3Eig,server,nowait \
    -device isa-serial,chardev=serial_id_20110420-105736-3Eig \
    -drive file=/home/devel/autotest-devel/client/tests/kvm/images/RHEL-Server-5.6-64-virtio.qcow2,index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native \
    -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 \
    -device virtio-net-pci,netdev=idS1QAcm,mac=9a:8a:ec:e1:df:1a,id=ndev00idS1QAcm,bus=pci.0,addr=0x3 \
    -netdev tap,id=idS1QAcm,vhost=on,ifname=t0-105736-3Eig,script=/home/devel/autotest-devel/client/tests/kvm/scripts/qemu-ifup-switch,downscript=no \
    -m 2048 -smp 2,cores=1,threads=1,sockets=2 \
    -cpu cpu64-rhel6,+sse2,+x2apic -vnc :1 \
    -rtc base=utc,clock=host,driftfix=slew -M rhel6.1.0 \
    -boot order=cdn,once=c,menu=off -usbdevice tablet \
    -no-kvm-pit-reinjection -enable-kvm
When I disable the kvmclock CPU flag on the qemu command line, the guest boots successfully with all three kernel command lines.

qemu-kvm ... -cpu cpu64-rhel6,+sse2,+x2apic,-kvmclock

1. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200 divider=10
   boots successfully
2. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200
   boots successfully
3. ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,115200 divider=10 noapic
   boots successfully
Going to need to know host hardware platform; also need to check if this is the latest guest kernel available. Please also indicate if the guest is 32-bit or 64-bit rhel6; in all probability this is a guest bug and not a blocker for kvm, but I'll leave it as proposed for 6.2 until we are certain.
It can only be reproduced with a RHEL-5-64 guest, 100% of the time.

guest kernel cmdline: ... clocksource=kvmclock divider=10
guest Kernel: 2.6.18-262.el5
host Kernel: 2.6.32-150.el6.x86_64
qemu-kvm: qemu-kvm-0.12.1.2-2.162.el6.x86_64
Can you remove the divider=10 from the guest kernel command line for the kvmclock choice? It should do nothing, but there is always a chance that it is still dividing down the guest clock internally and causing some sort of failure. Pretty certain this is a guest bug, will have to go look at the 5.6 64-bit code to see what's going on.
I have talked with Zachary; this should be a bug in the rhel5-64 guest. divider=10 probably causes the timekeeping to count 1/10th the number of interrupts, or sets the PIT rate too low during boot. It may get divided by DIVIDER, or use a HZ variable incorrectly. The other alternative is to detect that you are running in KVM and skip the test.

Since there's an easy workaround, we can try to fix it in the 5.8 cycle and close this bug as WONTFIX.
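The 1/10th-interrupt idea above can be put into numbers. The following is a user-space sketch, not kernel code. It assumes the guest is built with HZ=1000 and that the boot-time timer check waits (10 * 1000) / HZ milliseconds; neither value is stated in this bug, and the helper names window_ms and ticks_in_window are hypothetical.

```c
/*
 * Assumptions (not taken from this bug report): the guest uses HZ=1000
 * and the timer check waits (10 * 1000) / HZ milliseconds.
 */

/* Length of the assumed check window in milliseconds ("ten ticks" at full rate). */
unsigned int window_ms(unsigned int hz)
{
	return (10 * 1000) / hz;
}

/* Whole timer interrupts that fit in the window when the rate is hz / divider. */
unsigned int ticks_in_window(unsigned int hz, unsigned int divider)
{
	/* Time between two timer interrupts, in microseconds. */
	unsigned int period_us = (1000000 / hz) * divider;

	return (window_ms(hz) * 1000) / period_us;
}
```

Under these assumptions ticks_in_window(1000, 1) is 10 and ticks_in_window(1000, 10) is 1. With only one interrupt expected in the whole window, a small phase shift can leave the window with none, in which case a check that needs jiffies to advance would fail.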
Let's try fixing it in 5.8, since divider=10 reduces the guest wakeups by an order of magnitude. In the meantime, please add CFFR (https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes) notes to the BZ.
I added a debug statement to timer_irq_works() and called this function many times at the head of check_timer(). After many test runs, I found that jiffies does not change during the first call to timer_irq_works().

So I have a fix for this bug: add a small delay for kvmclock before the real call to timer_irq_works().

1. debug patch:

diff --git a/arch/x86_64/kernel/io_apic.c b/arch/x86_64/kernel/io_apic.c
index 9a764b9..2f0cf38 100644
--- a/arch/x86_64/kernel/io_apic.c
+++ b/arch/x86_64/kernel/io_apic.c
@@ -1479,7 +1484,8 @@ static int __init timer_irq_works(void)
 	 */
 
 	/* jiffies wrap? */
+	printk("t1: %lu, jiffies: %lu, \t%lu\n", t1, jiffies, jiffies-t1);
 	if (jiffies - t1 > 4)
 		return 1;
 	return 0;
 }
@@ -1990,6 +1990,16 @@ static inline void check_timer(void)
 	apic_printk(APIC_VERBOSE,KERN_INFO "..TIMER: vector=0x%02X apic1=%d pin1=%d apic2=%d pin2=%d\n", vector, apic1, pin1, apic2, pin2);
 
+	timer_irq_works();
+	timer_irq_works();
+	timer_irq_works();
+	....
+	....
 	if (pin1 != -1) {
 		/*
 		 * Ok, does IRQ0 through the IOAPIC work?

2. dmesg info:

ACPI: Core revision 20060707
t1: 4294667566, jiffies: 4294667566, 	0
t1: 4294667566, jiffies: 4294667576, 	10
t1: 4294667576, jiffies: 4294667596, 	20
t1: 4294667596, jiffies: 4294667606, 	10
t1: 4294667606, jiffies: 4294667616, 	10
t1: 4294667616, jiffies: 4294667626, 	10
t1: 4294667626, jiffies: 4294667636, 	10
t1: 4294667636, jiffies: 4294667656, 	20
t1: 4294667656, jiffies: 4294667666, 	10
t1: 4294667666, jiffies: 4294667676, 	10
t1: 4294667676, jiffies: 4294667686, 	10

3. My patch will be attached.
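To make the dmesg numbers easier to read, here is a small user-space sketch that replays the first five (t1, jiffies) pairs from the output above through the same "jiffies - t1 > 4" test that appears in the debug patch context. The struct and function names are hypothetical and exist only for this illustration; it is not the attached fix.

```c
#include <stddef.h>

/* One (t1, jiffies) pair as printed by the debug printk. */
struct sample {
	unsigned long t1;
	unsigned long now;
};

/* The first five pairs copied from the dmesg output in this comment. */
const struct sample dmesg_samples[] = {
	{ 4294667566UL, 4294667566UL },
	{ 4294667566UL, 4294667576UL },
	{ 4294667576UL, 4294667596UL },
	{ 4294667596UL, 4294667606UL },
	{ 4294667606UL, 4294667616UL },
};

/* Same condition as the check shown in the diff: more than 4 jiffies elapsed. */
int timer_check_passes(unsigned long t1, unsigned long now)
{
	return now - t1 > 4;
}

/* Count how many of the n samples would have failed the check. */
size_t count_failures(const struct sample *s, size_t n)
{
	size_t i, failures = 0;

	for (i = 0; i < n; i++)
		if (!timer_check_passes(s[i].t1, s[i].now))
			failures++;
	return failures;
}
```

Only the first sample (delta 0) fails; every later call sees a delta of 10 or 20 and passes. That matches the observation that the timer is working and only the first measurement comes too early.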
Created attachment 505263
draft patch from amos kong

0001-io_apic-Make-kernel-option-no_timer_check-always-wor.patch
0002-io_apic-Comparing-jiffies-with-other-values-by-time_.patch
0003-io_apic-Add-a-small-delay-before-real-calling-timer_.patch
Brew build: https://brewweb.devel.redhat.com/taskinfo?taskID=3403465
Amos,

From a bare-metal point of view,

0001-io_apic-Make-kernel-option-no_timer_check-always-wor.patch

-	if (!no_timer_check && timer_irq_works()) {

Will call timer_irq_works if no_timer_check == 0

+	if (timer_irq_works()) {

Will only run the *core* of timer_irq_works if no_timer_check == 0. So this is okay. Ack.

0002-io_apic-Comparing-jiffies-with-other-values-by-time_.patch

Ack.

0003-io_apic-Add-a-small-delay-before-real-calling-timer_.patch

Only affects kvm. But you spelled period as periad ;) Ack (if you fix the spelling mistake).

P.
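For readers unfamiliar with why jiffies comparisons get special treatment: the name of patch 0002 is truncated, so which macro it uses is not visible here. The sketch below assumes it is one of the kernel's time_after()/time_before() family and shows the idea behind them in user-space C with a 32-bit counter. The functions after32 and naive_after32 are hypothetical names for this illustration.

```c
#include <stdint.h>

/*
 * Wrap-safe "a is later than b", the same idea as the kernel's time_after():
 * take the unsigned difference and look at its sign. The cast relies on
 * two's-complement conversion, which is what the compilers used for the
 * kernel provide.
 */
int after32(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Naive comparison: gives the wrong answer once the counter has wrapped. */
int naive_after32(uint32_t a, uint32_t b)
{
	return a > b;
}
```

With b = 0xFFFFFFFE (just before the wrap) and a = 3 (just after it), after32(a, b) is true while naive_after32(a, b) is false. Note that the unsigned subtraction "jiffies - t1 > 4" in the existing check is already wrap-safe; the macro form mainly makes the intent explicit.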
Patch(es) available in kernel-2.6.18-282.el5 You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5 Detailed testing feedback is always welcomed.
Thanks Amos.

I reproduced the panic on these versions:
qemu-kvm-0.12.1.2-2.160.el6_1.8.x86_64
host kernel: kernel-2.6.32-182.el6
guest kernel: kernel-2.6.18-238.5.1.el5 x86_64
              kernel-2.6.18-274.el5 x86_64

with the console message:
Kernel panic - not syncing: IO-APIC + timer doesn't work! Try using the 'noapic' kernel parameter

Verified it on:
qemu-kvm-0.12.1.2-2.160.el6_1.8.x86_64
host kernel: kernel-2.6.32-182.el6
guest kernel: kernel-2.6.18-285.el5 x86_64

No panic; the guest boots successfully. So I am moving the status to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2012-0150.html