Bug 597144
Summary: | VM reboot automatically when run multi VM which is loaded
---|---
Product: | Red Hat Enterprise Linux 5
Reporter: | Golita Yue <gyue>
Component: | kvm
Assignee: | Andrea Arcangeli <aarcange>
Status: | CLOSED NOTABUG
QA Contact: | Virtualization Bugs <virt-bugs>
Severity: | medium
Priority: | low
Version: | 5.6
CC: | aarcange, gcosta, jwest, lihuang, llim, michen, ndai, virt-maint, zamsden
Target Milestone: | rc
Hardware: | All
OS: | Linux
Doc Type: | Bug Fix
Last Closed: | 2011-10-04 14:14:57 UTC
Bug Blocks: | 580949

Created attachment 417523 [details]
dump file

Is this behaviour exclusive to Windows guests?

Does it happen in an all-Linux scenario? A mixed scenario?

Thanks

(In reply to comment #3)
> Is this behaviour exclusive to Windows guests?
>
> Does it happen in an all-Linux scenario? A mixed scenario?
>
> Thanks

I started 6 Linux VMs and ran them for about 2 hours; no reboot happened.

This sounds like memory corruption or another catastrophic failure, not a kvmclock bug.

Indeed. Since it happens in a Windows-only environment, it is highly unlikely that kvmclock plays a role here.

Do you see any swap usage on the host? Can you try swapoff -a on the host?

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

This request was erroneously denied for the current release of Red Hat Enterprise Linux. The error has been fixed and this request has been re-proposed for the current release.

Running CPU_burn-in for 30 minutes on 6 VMs with 4 CPUs means some of them will not get as much CPU time as they would on real hardware. This may lead to IRQs being delivered late. If Windows 7 reboots when an APIC IRQ or NMI arrives late, this doesn't seem to be a KVM bug; rather, a tweak would be needed in Windows 7 to stop it from rebooting. A similar scenario would occur with a Linux guest that has the NMI watchdog enabled.

There wasn't enough info to debug, so I guess we can close it, considering also that it doesn't seem to be an obvious KVM bug. We can't give a guest more CPU than what's available on the hardware; some preemption and delays will happen with CPU overcommitting. I'm closing as NOTABUG for now, as it isn't certain this is a KVM bug.

The KVM clock still has to try to report the real time even if there are preemption delays, which can potentially trigger things like the NMI watchdog. The guest should be able to cope with that to be stable.

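To put a number on the overcommit argument above, here is a minimal sketch that computes the ratio of guest vCPUs to host CPUs. The figures are taken from this report (6 guests, -smp 4 on the qemu-kvm command line, 4 processors listed in /proc/cpuinfo). The overcommit_ratio helper is hypothetical, written only for illustration; it is not part of KVM or RHEV tooling.

```shell
# Hypothetical helper (illustration only): ratio of guest vCPUs to host CPUs.
# Usage: overcommit_ratio <number_of_vms> <vcpus_per_vm> <host_cpus>
overcommit_ratio() {
    vms=$1
    vcpus_per_vm=$2
    host_cpus=$3
    # Integer arithmetic scaled by 100 to keep two decimals without bc.
    scaled=$(( vms * vcpus_per_vm * 100 / host_cpus ))
    printf '%d.%02d\n' $(( scaled / 100 )) $(( scaled % 100 ))
}

# Figures from this report: 6 VMs, 4 vCPUs each, 4 host CPUs.
overcommit_ratio 6 4 4
```

With these inputs the guests alone ask for six times the CPU the host has, before counting the per-CPU busy loops started by the host load script, so delayed interrupt delivery inside the guests is plausible.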
Created attachment 417522 [details]
debug info

Description of problem:
When I run 6 VMs on the same host and put CPU load on them, three of them reboot automatically once host CPU utilization gets high (>90%). After the reboot, the guests display an "unexpected shutdown" notice.

Version-Release number of selected component (if applicable):
kvm-83-164.el5_5.9
kernel: 2.6.18-194.3.1.el5
rhev-hypervisor-5.5-2.2.0.16.1
sm69

How reproducible:
1/1

Steps to Reproduce:
1. Install win7_x86 from RHEV-M.
2. Make a template of win7_x86.
3. Create 7 new VMs based on the win7_x86 template.
4. Load the host with this script:
   for (( I=0; I<`cat /proc/cpuinfo | grep processor | wc -l`; I++ )); do
       echo $I
       taskset -c $I /bin/bash -c 'for ((;;)); do X=1; done &'
   done
5. Select 6 VMs by holding Shift (RHEV-M alerted me that it cannot run the 7th VM; you may be able to run more VMs).
6. Press the Run button.
7. Wait until the VMs answer ping, then run CPU_burn-in for 30 minutes.

Actual results:
An unexpected shutdown occurred and three VMs rebooted automatically.

Expected results:
All VMs can finish the CPU_burn-in test.

Additional info:
For the debug info and dump file, please refer to the attachments.

cmd:
/usr/libexec/qemu-kvm -no-hpet -usb -rtc-td-hack -startdate 2010-05-28T01:22:38 -name win7_nfs_s3 -smp 4,cores=1 -k en-us -m 1024 -boot cd -net nic,vlan=1,macaddr=00:1a:4a:42:41:1c,model=rtl8139 -net tap,vlan=1,ifname=rtl8139_13_1,script=no -drive file=/rhev/data-center/b8a6bc1d-7935-4129-9b7a-483906949cc3/23c959c9-ea7d-4468-b308-f3e1cb04b345/images/3e90f6ee-62bd-40cb-9920-aae369ded9ab/20c245b0-0cdc-40c8-91cb-29f5edf5c8b7,media=disk,if=ide,cache=off,index=0,serial=cb-9920-aae369ded9ab,boot=off,format=qcow2,werror=stop -pidfile /var/vdsm/1ccd51ef-3e04-49fc-8132-912ef93f9090.pid -vnc 0:13,password -cpu qemu64,+sse2,+cx16,+ssse3 -M rhel5.5.0 -notify all -balloon none -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=5.5-2.2-0.16.1,serial=009A33EA-2514-DF11-874D-9EA3C4859730_6c:f0:49:27:33:32,uuid=1ccd51ef-3e04-49fc-8132-912ef93f9090 -vmchannel di:0200,unix:/var/vdsm/1ccd51ef-3e04-49fc-8132-912ef93f9090.guest.socket,server -monitor unix:/var/vdsm/1ccd51ef-3e04-49fc-8132-912ef93f9090.monitor.socket,server

Top Result:
top - 05:36:32 up 22:47, 1 user, load average: 16.64, 14.99, 13.83
Tasks: 147 total, 7 running, 140 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.4%us, 96.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 7758560k total, 5079540k used, 2679020k free, 62980k buffers
Swap: 8073208k total, 81116k used, 7992092k free, 1283688k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2504 vdsm      15   0 1251m 1.0g 577m S 96.1 13.8  61:39.26 qemu-kvm
11020 vdsm      15   0 1257m 1.0g 730m S 93.2 13.6  18:13.03 qemu-kvm
 2665 vdsm      15   0 1251m 1.0g 652m S 52.5 13.8  64:39.53 qemu-kvm
 2136 root      15   0     0    0    0 R 44.0  0.0  16:36.27 kksmd
 2344 vdsm      15   0 1251m 1.0g 438m S 36.7 13.8  58:48.14 qemu-kvm
 2444 vdsm      15   0 1251m 1.0g 579m R 30.8 13.8  63:02.52 qemu-kvm
 2978 root      25   0  8668  532  380 R 17.7  0.0  27:59.13 bash
 2972 root      25   0  8668  536  380 R 13.5  0.0  28:24.64 bash
 2747 vdsm      15   0 1255m 1.0g 733m S 10.8 13.8  61:43.85 qemu-kvm
 2984 root      25   0  8668  532  380 R  3.0  0.0  27:14.06 bash
 7085 vdsm      10  -5  520m  15m 3036 S  1.0  0.2   4:46.23 vdsm

Host information:
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping : 10
cpu MHz : 2659.988
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 5319.97
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping : 10
cpu MHz : 2659.988
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 5319.94
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping : 10
cpu MHz : 2659.988
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 5320.01
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
stepping : 10
cpu MHz : 2659.988
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 5319.99
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
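
The top output above reports 81116k of swap in use on the host, which bears on the earlier question about host swap. As a rough sketch of how that figure could be re-checked while the test is running, the snippet below subtracts SwapFree from SwapTotal in /proc/meminfo. The swap_used_kb helper is hypothetical and written only for illustration; it assumes a Linux host whose /proc/meminfo has the usual SwapTotal and SwapFree fields.

```shell
# Hypothetical helper (illustration only): print swap in use, in kB.
# Usage: swap_used_kb [file]   (file defaults to /proc/meminfo)
swap_used_kb() {
    # SwapTotal and SwapFree are reported in kB in the second field.
    awk '/^SwapTotal:/ {t=$2} /^SwapFree:/ {f=$2} END {print t - f}' "${1:-/proc/meminfo}"
}

# Only query the live host when /proc/meminfo is actually readable.
if [ -r /proc/meminfo ]; then
    swap_used_kb /proc/meminfo
fi
```

A value that keeps growing during the CPU_burn-in run would suggest that the host is swapping guest memory, which is one of the things swapoff -a was meant to rule out.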