Bug 523450 - cpu1 didn't come online in a kvm i686 guest
Summary: cpu1 didn't come online in a kvm i686 guest
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Glauber Costa
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 524151 528898
 
Reported: 2009-09-15 14:58 UTC by lihuang
Modified: 2010-03-30 07:42 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 07:42:43 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2010:0178 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 5.5 kernel security and bug fix update 2010-03-29 12:18:21 UTC

Description lihuang 2009-09-15 14:58:12 UTC
Description of problem:
Failed to bring cpu1 back online.

Version-Release number of selected component (if applicable):
kernel-2.6.18-164.1.1.el5.i686

How reproducible:
100%

Steps to Reproduce:
1. Start the kvm guest.
2. echo 0 > /sys/devices/system/cpu/cpu1/online
3. echo 1 > /sys/devices/system/cpu/cpu1/online
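
The log under "Actual results" shows cpu1 going offline first and then failing to boot again, so the cycle can be sketched as a small shell function. This is only an illustration: the function name cycle_cpu and the SYSFS_ROOT override are hypothetical and not part of the report. The override exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Cycle a CPU offline and back online through sysfs, then print its state.
# SYSFS_ROOT is a hypothetical override (not in the report) that lets the
# function be exercised against a scratch directory instead of the real /sys.
cycle_cpu() {
    cpu="$1"
    ctl="${SYSFS_ROOT:-/sys}/devices/system/cpu/cpu${cpu}/online"

    if [ ! -w "$ctl" ]; then
        echo "cannot write $ctl" >&2
        return 1
    fi

    echo 0 > "$ctl" || return 1   # take the CPU offline
    echo 1 > "$ctl" || return 1   # bring it back; the guest oopsed at this point
    cat "$ctl"                    # prints 1 if the CPU is online again
}

# Usage inside the guest, as root: cycle_cpu 1
```

On an affected guest the second write is the one that triggers the oops shown in the log.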
  
Actual results:
CPU 1 is now offline
SMP alternatives: switching to UP code
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 11000
CPU 1 irqstacks, hard=c0762000 soft=c0742000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5319.66 BogoMIPS (lpj=2659830)
CPU: After generic identify, caps: 078bfbfd 2191abfd 00000000 00000000 80000001 00000000 00000000
CPU: After vendor identify, caps: 078bfbfd 2191abfd 00000000 00000000 80000001 00000000 00000000
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 2048K
CPU: After all inits, caps: 078bf3fd 2191abfd 00000000 00000040 80000001 00000000 00000000
int3: 0000 [#1]
SMP 
last sysfs file: /devices/system/cpu/cpu1/online
Modules linked in: nfs fscache nfs_acl nls_utf8 ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge autofs4 hidp rfcomm l2cap bluetooth dm_log_clustered(U) lockd sunrpc ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic uio cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac ipv6 xfrm_nalgo crypto_api lp floppy joydev pcspkr virtio_pci virtio_ring virtio serio_raw i2c_piix4 i2c_core parport_pc parport 8139too 8139cp mii ide_cd cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    1
EIP:    0060:[<c0710494>]    Tainted: G      VLI
EFLAGS: 00000046   (2.6.18-164.1.1.el5 #1) 
EIP is at kvmclock_init+0x1/0x92
eax: 00000002   ebx: 4b4d564b   ecx: 564b4d56   edx: 0000004d
esi: f7c83f55   edi: c0632ad8   ebp: c06ee900   esp: f7c83f44
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, ti=f7c83000 task=c20ef550 task.ti=f7c83000)
Stack: c040defb 4b6611f9 564b4d56 4d564b4d 00000000 0000004d 564b4d56 4b4d564b 
       00000007 c06611f9 c06308be c06ee900 c040c031 c06611f9 00000001 c06ee900 
       00000100 00000000 c041632e 00000001 00000000 00000000 c041660c 00000000 
Call Trace:
 [<c040defb>] init_hypervisor+0x80/0x93
 [<c040c031>] identify_cpu+0x22e/0x27e
 [<c041632e>] smp_store_cpu_info+0x2c/0xa5
 [<c041660c>] start_secondary+0xac/0x40c
 =======================
Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc <cc> cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 
EIP: [<c0710494>] kvmclock_init+0x1/0x92 SS:ESP 0068:f7c83f44
 <0>Kernel panic - not syncing: Fatal exception
 Stuck ??
Inquiring remote APIC #1...
... APIC #1 ID: failed
... APIC #1 VERSION: failed
... APIC #1 SPIV: failed
skipping cpu1, didn't come online
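
The register dump in the oops gives a hint about what init_hypervisor had just done. Read as little-endian bytes, ebx, ecx and edx spell out the KVM CPUID signature. The sketch below decodes them. It assumes the registers still held the result of the hypervisor CPUID leaf when the trap fired, which the report itself does not state. The helper name decode_regs is hypothetical.

```shell
#!/bin/sh
# Decode 32-bit register values (hex, as printed in the oops) into the ASCII
# bytes they hold in little-endian order. NUL bytes are skipped.
decode_regs() {
    for reg in "$@"; do
        val=$((0x$reg))
        for bits in 0 8 16 24; do
            byte=$(( (val >> bits) & 0xff ))
            if [ "$byte" -ne 0 ]; then
                # Emit the byte through an octal escape, which POSIX printf supports.
                printf "\\$(printf '%03o' "$byte")"
            fi
        done
    done
    echo
}

# ebx, ecx, edx from the oops above
decode_regs 4b4d564b 564b4d56 0000004d   # prints KVMKVMKVM
```

Seeing the signature in the registers is consistent with the call trace: the hypervisor was detected as KVM, and the very next step, the call into kvmclock_init, is where the CPU trapped.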

Expected results:


Additional info:
This only happens in an i686 guest.
An x86_64 guest is OK.


The command line I used to boot the guest:
/usr/libexec/qemu-kvm -smp 2 -cpu qemu64,+sse2 -m 2048 -drive file=/data/images/images/pvclock.i686.30.raw -net nic,macaddr=00:21:9B:00:5F:02 -net tap,script=/data/images/qemu-ifup -drive file=/data/images/isos/0813.i386.boot.iso,media=cdrom -boot c -vnc :12 -monitor stdio -startdate now -usbdevice tablet -notify all

Comment 1 Glauber Costa 2009-09-15 15:49:23 UTC
Ok, the problem here is actually quite simple.

kvmclock_init comes from upstream marked as __init.

But in RHEL kernels, the code that detects the hypervisor is called every time a CPU comes online, at CPU detection (from identify_cpu), and is therefore marked as __cpuinit.
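
One way to read the oops in light of this: the kernel releases __init text once boot has finished, so a later call from a __cpuinit path lands in memory that no longer holds the function. The "Code:" line in the report is all cc bytes, and 0xcc is the x86 int3 opcode, which matches the "int3" trap at kvmclock_init+0x1. This is an inference from the log, not something stated in the comment. The sketch below checks a "Code:" line for that pattern. The helper name code_is_all_int3 is hypothetical.

```shell
#!/bin/sh
# Succeed if every byte on an oops "Code:" line is cc (the x86 int3 opcode).
# The <..> markers around the faulting byte are stripped before comparing.
code_is_all_int3() {
    bytes=$(printf '%s\n' "$1" | sed -e 's/^Code: *//' -e 's/[<>]//g')
    [ -n "$bytes" ] || return 1
    for b in $bytes; do
        [ "$b" = "cc" ] || return 1
    done
    return 0
}

# Usage with a shortened copy of the line from the report:
if code_is_all_int3 "Code: cc cc cc <cc> cc cc"; then
    echo "code bytes are all int3"
fi
```

A "Code:" line that shows ordinary instruction bytes instead would point away from this explanation.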

Comment 4 Subhendu Ghosh 2009-09-16 06:30:42 UTC
Is the reproducer inside the guest or from the host?

Comment 5 lihuang 2009-09-16 06:53:27 UTC
(In reply to comment #4)
> Is the reproducer inside the guest or from the host?

Inside the kvm guest.

Comment 10 Don Zickus 2009-09-22 20:17:31 UTC
in kernel-2.6.18-166.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 16 errata-xmlrpc 2010-03-30 07:42:43 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html

