Bug 554648

Summary: Bringing an offline CPU back online while the host is loaded leads to a host kernel panic
Product: Red Hat Enterprise Linux 5
Reporter: Suqin Huang <shuang>
Component: kvm
Assignee: Eduardo Habkost <ehabkost>
Status: CLOSED WONTFIX
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: low
Version: 5.4.z
CC: ehabkost, tburke, virt-maint, ykaul
Target Milestone: rc
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-01-20 09:42:47 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 510814
Bug Blocks:

Description Suqin Huang 2010-01-12 09:15:23 UTC
Description of problem:
Bringing an offline CPU back online while the host is loaded leads to a host kernel panic.

Version-Release number of selected component (if applicable):
2.6.18-164.9.1.el5

How reproducible:
100%

Steps to Reproduce:
1. Run a VM on the host:
/usr/libexec/qemu-kvm  -smp 8 -m 32G -drive file=/root/rhel5.4-32.qcow2,media=disk,if=ide,cache=off,index=0,serial=fb-bde1-8bcf10f72b98 -net nic,vlan=0,macaddr=00:1a:4a:01:00:37,model=e1000 -net tap,vlan=0,script=/etc/qemu-ifup -uuid `uuidgen` -no-hpet -usbdevice tablet  -rtc-td-hack  -startdate now -cpu qemu64,+sse2  -monitor stdio -boot c -vnc :6

2. Take a CPU offline:
 echo 0 >/sys/devices/system/cpu/cpu3/online
3. Bring the CPU back online:
 echo 1 >/sys/devices/system/cpu/cpu3/online
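
Steps 2 and 3 can be wrapped in a small helper for repeated runs. This is only a sketch, not part of the original reproduction: the function name cycle_cpu and the SYSFS_CPU_ROOT override are hypothetical, added so the helper can be dry-run against a scratch directory instead of the real sysfs tree.

```shell
#!/bin/sh
# Sketch: offline and re-online one CPU through its sysfs "online" file.
# SYSFS_CPU_ROOT is a hypothetical override for dry runs; by default the
# real sysfs path from the steps above is used.

cycle_cpu() {
    cpu="$1"
    root="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"
    ctl="$root/cpu$cpu/online"

    # Needs root on a real host; some CPUs (often cpu0) have no "online" file.
    [ -w "$ctl" ] || { echo "cannot write $ctl" >&2; return 1; }

    echo 0 > "$ctl" || return 1
    echo "cpu$cpu offline: $(cat "$ctl")"

    echo 1 > "$ctl" || return 1
    echo "cpu$cpu online: $(cat "$ctl")"
}
```

With a guest running, `cycle_cpu 3` as root corresponds to steps 2 and 3. On an affected host this is expected to panic the machine, so it should only be run on a test system.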
  
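Actual results:
The host kernel panics (see the console output in item 4 under Additional info).

Expected results:
The CPU comes back online and the host keeps running.
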
Additional info:
a. The host works well when a CPU is offlined/onlined with no guest running.
b. The host works well on 2.6.18-182.el5 with kvm-83-140.
c. Offlining and onlining a CPU and then starting a guest also leads to a host kernel panic.
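
To check which of the two combinations named in this report a host matches, something like the following could be used. It is a sketch only: classify_host is a hypothetical helper, and it deliberately reports "untested" for anything other than the two kernel/kvm pairs listed here, since the report gives no data for other builds.

```shell
#!/bin/sh
# Sketch: compare a host against the two kernel/kvm combinations in this report.
# $1: kernel release, e.g. the output of `uname -r`
# $2: kvm version-release, e.g. `rpm -q --qf '%{VERSION}-%{RELEASE}' kvm`

classify_host() {
    kernel="$1"
    kvm="$2"
    case "$kernel/$kvm" in
        # Combination that panicked in this report
        2.6.18-164.9.1.el5*/83-105.el5_4.18*)
            echo "panics-in-report" ;;
        # Combination reported as working
        2.6.18-182.el5*/83-140|2.6.18-182.el5*/83-140.*)
            echo "works-in-report" ;;
        # No information in this report
        *)
            echo "untested" ;;
    esac
}
```

For example, `classify_host "$(uname -r)" "$(rpm -q --qf '%{VERSION}-%{RELEASE}' kvm)"` prints one of the three labels for the running host.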


1. host kernel
2.6.18-164.9.1.el5

2. kvm
kvm-tools-83-105.el5_4.18
kvm-83-105.el5_4.18
etherboot-zroms-kvm-5.4.4-13.el5
etherboot-roms-kvm-5.4.4-13.el5
kvm-debuginfo-83-105.el5_4.18
kvm-qemu-img-83-105.el5_4.18
kmod-kvm-83-105.el5_4.18

3. host cpu
processor       : 3
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 2
model name      : AMD Phenom(tm) 9600B Quad-Core Processor

4. host console output

Breaking affinity for irq 209
Breaking affinity for irq 225
CPU 3 is now offline
Initializing CPU#3
using mwait in idle threads.
AMD Phenom(tm) 9600B Quad-Core Processor stepping 03
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at ...-83-maint-snapshot-20090205/kernel-/x86/kvm_main.c:2451
invalid opcode: 0000 [1] SMP 
last sysfs file: /class/cpuid/cpu3/dev
CPU 3 
Modules linked in: tun radeon drm ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables nfsd exportfs nfs_acl auth_rpcgss autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc bridge ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi cpufreq_ondemand powernow_k8 freq_table dm_round_robin dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy ksm(U) kvm_amd(U) kvm(U) snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss sr_mod snd_mixer_oss snd_pcm cdrom snd_timer shpchp snd_page_alloc snd_hwdep i2c_piix4 snd tpm_infineon i2c_core tpm tg3 tpm_bios soundcore serio_raw pcspkr sg dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 4110, comm: qemu-kvm Tainted: G      2.6.18-164.9.1.el5 #1
RIP: 0010:[<ffffffff883b1489>]  [<ffffffff883b1489>] :kvm:kvm_handle_fault_on_reboot+0xb/0x16
RSP: 0018:ffff8101fb951da0  EFLAGS: 00010046
RAX: ffff8101f2167000 RBX: ffff8101f77b63c0 RCX: 00000000c0000101
RDX: ffff8101f2167000 RSI: ffff8101f215e000 RDI: ffff8101f237c040
RBP: ffff8101f237c040 R08: 00000000000000ef R09: 0000000000000000
R10: ffff81020cdc58a8 R11: ffffffff8003c4dd R12: ffff8101f215e000
R13: 0000000000000000 R14: 000000000000000f R15: 0000000000001000
FS:  0000000040b8f940(0063) GS:ffff81021fc1c640(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002aabbd437000 CR3: 00000001fbd77000 CR4: 00000000000006a0
Process qemu-kvm (pid: 4110, threadinfo ffff8101fb950000, task ffff8101f69537e0)
Stack:  ffffffff883eaa66 0000000000630096 ffff8101f77b63c0 ffff8101f237c040
 ffff8101f215e000 0000000000000000 000000000000000f 0000000000001000
 ffffffff883b872e fffffffe7ffbfeff ffff8101f1cacb70 ffff81021f02f910
Call Trace:
 [<ffffffff883eaa66>] :kvm_amd:svm_vcpu_run+0x1b0/0x3a5
 [<ffffffff883b872e>] :kvm:kvm_arch_vcpu_ioctl_run+0x397/0x60b
 [<ffffffff883b4108>] :kvm:kvm_vcpu_ioctl+0xf2/0x45d
 [<ffffffff8008c584>] default_wake_function+0x0/0xe
 [<ffffffff80042143>] do_ioctl+0x21/0x6b
 [<ffffffff800302cf>] vfs_ioctl+0x457/0x4b9
 [<ffffffff8004c804>] sys_ioctl+0x59/0x78
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 0f 0b 68 82 b8 3c 88 c2 93 09 c3 55 48 89 fd 53 31 db 48 83 
RIP  [<ffffffff883b1489>] :kvm:kvm_handle_fault_on_reboot+0xb/0x16
 RSP <ffff8101fb951da0>
 <0>Kernel panic - not syncing: Fatal exception
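
When comparing this panic against other reports, the key fields of the oops above can be pulled out mechanically. The following is a rough sketch that assumes the 2.6.18-style oops layout shown here (a "Kernel BUG at" line, a "RIP:" register line, and a "Call Trace:" block of "[<address>] symbol" lines); oops_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: summarize an oops log read from stdin.
# Prints the BUG location, the faulting symbol from the "RIP:" line,
# and one line per call-trace frame.

oops_summary() {
    awk '
        /Kernel BUG at/ { sub(/^.*Kernel BUG at /, ""); print "bug_at=" $0 }
        /^RIP: /        { print "rip=" $NF }
        /^Call Trace:/  { in_trace = 1; next }
        # The trace ends at the first line that does not start with "[<"
        in_trace && $1 !~ /^\[</ { in_trace = 0 }
        in_trace        { print "frame=" $NF }
    '
}
```

Saving the console output to a file and running `oops_summary < oops.txt` gives a short summary that is easier to diff between runs than the full register dump.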

Comment 1 Eduardo Habkost 2010-01-12 12:02:37 UTC
From the backtrace, it looks like a KVM module bug, not a kernel bug.

Comment 2 Eduardo Habkost 2010-01-12 12:05:54 UTC
Oh, it's 5.4.z, so it's because the patches from bug #510814 aren't included in the kernel package.

Comment 3 Eduardo Habkost 2010-01-12 12:08:29 UTC
Moving back to the kernel component, and adding bug #510814 as a dependency.

Comment 4 Dor Laor 2010-01-20 09:42:47 UTC
(In reply to comment #0)

> Additional info:
> a. host work well when offline/online cpu without running guest.
> b. host work well on 2.6.18-182.el5 with kvm-83-140

So it is only a 5.4.z issue? Then it is not worth fixing.
