Description of problem:
Failed to online cpu1.

Version-Release number of selected component (if applicable):
kernel-2.6.18-164.1.1.el5.i686

How reproducible:
100%

Steps to Reproduce:
1. Start the kvm guest
2. echo 1 > /sys/devices/system/cpu/cpu1/online
3. echo 0 > /sys/devices/system/cpu/cpu1/online

Actual results:
CPU 1 is now offline
SMP alternatives: switching to UP code
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 11000
CPU 1 irqstacks, hard=c0762000 soft=c0742000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5319.66 BogoMIPS (lpj=2659830)
CPU: After generic identify, caps: 078bfbfd 2191abfd 00000000 00000000 80000001 00000000 00000000
CPU: After vendor identify, caps: 078bfbfd 2191abfd 00000000 00000000 80000001 00000000 00000000
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: After all inits, caps: 078bf3fd 2191abfd 00000000 00000040 80000001 00000000 00000000
int3: 0000 [#1]
SMP
last sysfs file: /devices/system/cpu/cpu1/online
Modules linked in: nfs fscache nfs_acl nls_utf8 ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge autofs4 hidp rfcomm l2cap bluetooth dm_log_clustered(U) lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac ipv6 xfrm_nalgo crypto_api lp floppy joydev pcspkr virtio_pci virtio_ring virtio serio_raw i2c_piix4 i2c_core parport_pc parport 8139too 8139cp mii ide_cd cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU: 1
EIP: 0060:[<c0710494>] Tainted: G VLI
EFLAGS: 00000046 (2.6.18-164.1.1.el5 #1)
EIP is at kvmclock_init+0x1/0x92
eax: 00000002 ebx: 4b4d564b ecx: 564b4d56 edx: 0000004d
esi: f7c83f55 edi: c0632ad8 ebp: c06ee900 esp: f7c83f44
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, ti=f7c83000 task=c20ef550 task.ti=f7c83000)
Stack: c040defb 4b6611f9 564b4d56 4d564b4d 00000000 0000004d 564b4d56 4b4d564b
       00000007 c06611f9 c06308be c06ee900 c040c031 c06611f9 00000001 c06ee900
       00000100 00000000 c041632e 00000001 00000000 00000000 c041660c 00000000
Call Trace:
 [<c040defb>] init_hypervisor+0x80/0x93
 [<c040c031>] identify_cpu+0x22e/0x27e
 [<c041632e>] smp_store_cpu_info+0x2c/0xa5
 [<c041660c>] start_secondary+0xac/0x40c
 =======================
Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc <cc> cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
EIP: [<c0710494>] kvmclock_init+0x1/0x92 SS:ESP 0068:f7c83f44
 <0>Kernel panic - not syncing: Fatal exception
Stuck ??
Inquiring remote APIC #1...
... APIC #1 ID: failed
... APIC #1 VERSION: failed
... APIC #1 SPIV: failed
skipping cpu1, didn't come online

Expected results:

Additional info:
This only happens in an i686 guest; an x86_64 guest is OK.

The command line I used to boot the guest:
/usr/libexec/qemu-kvm -smp 2 -cpu qemu64,+sse2 -m 2048 -drive file=/data/images/images/pvclock.i686.30.raw -net nic,macaddr=00:21:9B:00:5F:02 -net tap,script=/data/images/qemu-ifup -drive file=/data/images/isos/0813.i386.boot.iso,media=cdrom -boot c -vnc :12 -monitor stdio -startdate now -usbdevice tablet -notify all
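
For anyone re-running this, the reproduction steps above can be wrapped in a small helper. This is only a sketch: the function name set_cpu_online and the SYSFS_CPU_ROOT override are hypothetical (not part of any tool), and the override exists purely so the helper can be dry-run against a scratch directory instead of the real sysfs tree.

```shell
#!/bin/sh
# Hypothetical helper for the CPU hotplug reproducer.
# SYSFS_CPU_ROOT is an assumed override for dry runs; by default it
# points at the sysfs path used in the steps to reproduce.
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

# set_cpu_online CPU STATE
#   CPU   - CPU number, e.g. 1
#   STATE - 0 to take the CPU offline, 1 to bring it online
set_cpu_online() {
    cpu="$1"
    state="$2"
    ctl="$SYSFS_CPU_ROOT/cpu$cpu/online"

    # Only 0 and 1 are meaningful values for the control file.
    case "$state" in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac

    # CPUs that cannot be hotplugged have no 'online' file.
    if [ ! -e "$ctl" ]; then
        echo "no hotplug control file: $ctl" >&2
        return 1
    fi

    echo "$state" > "$ctl" || return 1

    # Read the state back so a silently ignored request is reported.
    [ "$(cat "$ctl")" = "$state" ]
}
```

Run as root inside the guest, for example `set_cpu_online 1 0 && set_cpu_online 1 1`. The log above prints "CPU 1 is now offline" before the boot attempt, so this example offlines the CPU first; swap the two calls to follow the steps exactly as listed. On an affected kernel the online request is expected to panic the guest rather than return.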
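OK, the problem here is actually quite simple. kvmclock_init comes from upstream marked as __init, so its text is discarded once boot completes. But in RHEL kernels, the code that detects the hypervisor is called every time a CPU comes online, during CPU detection (from identify_cpu), and is therefore marked __cpuinit. Onlining cpu1 after boot thus calls into the already-freed kvmclock_init, which matches the oops: the Code dump at kvmclock_init is all 0xcc bytes, the int3 opcode.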
Is the reproducer inside the guest or from the host?
(In reply to comment #4)
> Is the reproducer inside the guest or from the host?

Inside the kvm guest.
in kernel-2.6.18-166.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However, feel free to provide a comment indicating that this fix has been verified.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html